3. Data, Science & AI

24 skills

Found 9989 skills

Total Stars: 6.7M
Avg Stars: 667

pymc-bayesian-modeling

davila7

18.0K

Enables Bayesian inference with PyMC, including hierarchical models, MCMC (NUTS), and model comparison techniques for probabilistic programming.

PyMC
Bayesian Inference
MCMC

citation-management

davila7

18.0K

Comprehensive citation management tool for academic research, enabling search, metadata extraction, citation validation, and BibTeX generation from sources like Google Scholar and PubMed.

Google Scholar
PubMed
BibTeX

tensorboard

davila7

18.0K

Visualizes ML training metrics, model graphs, and performance for debugging and experiment comparison using Google's TensorBoard.

TensorBoard
ML Visualization
Model Debugging

huggingface-accelerate

davila7

18.0K

Unifies distributed training frameworks for PyTorch with minimal code changes, automatic device placement, and mixed precision support.

PyTorch
Distributed Training
HuggingFace

datamol

davila7

18.0K

Simplifies RDKit for drug discovery tasks including SMILES parsing, molecular descriptors, and 3D conformer generation.

RDKit
SMILES
Molecular Descriptors

knowledge-distillation

davila7

18.0K

Compresses large language models via knowledge distillation, retaining performance while reducing inference costs. Supports soft targets and logit distillation techniques.

Knowledge Distillation
Model Compression
Logit Distillation

neuropixels-analysis

davila7

18.0K

Analyzes Neuropixels neural recordings including preprocessing, spike sorting, and AI-assisted quality assessment for extracellular electrophysiology data.

Neuropixels
Kilosort
Spike Sorting

protocolsio-integration

davila7

18.0K

Integrates with protocols.io API to manage scientific protocols, including creation, updates, collaboration, and documentation for lab and research workflows.

protocols.io
protocol management

histolab

davila7

18.0K

Toolkit for processing whole slide pathology images, including tile extraction, tissue segmentation, and dataset preparation for deep learning in computational pathology.

Whole Slide Imaging
Tissue Segmentation
Deep Learning

lamindb

davila7

18.0K

Manages biological datasets according to FAIR principles, ensuring traceability, reproducibility, and integration with scientific workflows and MLOps platforms.

LaminDB
FAIR
Biological ontologies

scvi-tools

davila7

18.0K

Enables probabilistic modeling and analysis of single-cell omics data, including scRNA-seq, scATAC-seq, and spatial transcriptomics, for tasks like batch correction and cell type annotation.

scvi-tools
single-cell
omics

pyopenms

davila7

18.0K

Python interface for mass spectrometry data analysis, enabling proteomics and metabolomics workflows with file handling and quantitative processing.

OpenMS
Proteomics
Metabolomics

pytorch-lightning

davila7

18.0K

High-level PyTorch framework simplifying training loops with distributed computing, callbacks, and minimal boilerplate for scalable machine learning development.

PyTorch Lightning
Distributed Training
Trainer

neurokit2

davila7

18.0K

Comprehensive toolkit for processing and analyzing physiological signals including EEG, ECG, and EDA for scientific research and clinical applications.

Biosignal Processing
EEG

llama-cpp

davila7

18.0K

Runs LLM inference on CPUs, Apple Silicon, and consumer GPUs, with no NVIDIA hardware required, using GGUF-quantized models for efficiency.

LLM
GGUF
Edge

gget

davila7

18.0K

Provides rapid bioinformatics queries via CLI and Python, accessing multiple biological databases for sequence analysis and research.

BLAST
Bioinformatics

pytorch-fsdp

davila7

18.0K

Expert guidance for PyTorch FSDP training with parameter sharding, mixed precision, and CPU offloading in distributed deep learning.

PyTorch
FSDP
Mixed Precision

senior-data-engineer

davila7

18.0K

Builds scalable data pipelines, ETL/ELT systems, and data infrastructure using Spark, Airflow, and dbt for data modeling and orchestration.

Spark
Airflow
dbt

long-context

davila7

18.0K

Extends transformer model context windows using RoPE, YaRN, ALiBi, and position interpolation for processing long documents (32k-128k+ tokens).

RoPE
Position Interpolation
Transformer Context

stable-baselines3

davila7

18.0K

Provides a library for training and experimenting with reinforcement learning agents using algorithms like PPO and SAC in Gym environments.

Reinforcement Learning
PPO
Gym

cirq

davila7

18.0K

Framework for building, simulating, and executing quantum circuits, supporting quantum algorithms and hardware integration.

Quantum Circuits
Quantum Hardware
Circuit Optimization

pdf-processing-pro

davila7

18.0K

Provides production-grade PDF processing for forms, tables, OCR, and batch operations with robust validation and error handling.

OCR
PDF Forms
Batch Processing

cocoindex

davila7

18.0K

Comprehensive toolkit for building AI data pipelines including ETL workflows, vector embeddings, and knowledge graphs via CocoIndex library.

CocoIndex
Vector Embeddings
Knowledge Graphs

deeptools

davila7

18.0K

Provides genomic data analysis and visualization for NGS workflows, including BAM to bigWig conversion, QC metrics, and heatmaps for ChIP-seq, RNA-seq, and ATAC-seq.

BAM
bigWig
ChIP-seq