3. Data, Science & AI

24 skills

Found 9989 skills

Total Stars: 6.7M
Avg Stars: 667

grpo-rl-training

davila7

18.0K

Provides expert guidance for fine-tuning language models using GRPO and RL with the TRL library, focusing on reasoning and task-specific training.

GRPO
RL
TRL

llava

davila7

18.0K

Enables image understanding, visual question answering, and multi-turn image-based conversations using a CLIP vision encoder and a LLaMA language model.

CLIP
LLaMA
Vision-Language

tensorrt-llm

davila7

18.0K

Accelerates LLM inference with NVIDIA TensorRT for high throughput, low latency, and quantization support on NVIDIA GPUs.

NVIDIA TensorRT
LLM Inference
Quantization

google-analytics

davila7

18.0K

Analyzes Google Analytics data to provide insights on website performance, traffic patterns, and conversion rates, enabling data-driven optimization strategies.

Google Analytics
Traffic Analysis
Conversion Rate

speculative-decoding

davila7

18.0K

Accelerates LLM inference with speculative decoding, Medusa, and lookahead techniques, achieving a 1.5-3.6x speedup for real-time applications and constrained deployments.

Speculative Decoding
Medusa
Lookahead Decoding

nowait-reasoning-optimizer

davila7

18.0K

Reduces chain-of-thought token usage by 27-51% in reasoning models (QwQ, DeepSeek-R1) while preserving accuracy via the NOWAIT technique.

NOWAIT
Chain-of-Thought
Token Efficiency

axolotl

davila7

18.0K

Provides expert guidance for fine-tuning LLMs using Axolotl with YAML configurations and LoRA/QLoRA techniques.

Axolotl
LoRA
YAML

mlflow

davila7

18.0K

Manages the end-to-end machine learning lifecycle, including experiment tracking, model versioning, and production deployment.

MLflow
Experiment Tracking
Model Registry

deepspeed

davila7

18.0K

Expert guidance for distributed training with DeepSpeed, covering ZeRO optimization, pipeline parallelism, and mixed precision techniques.

DeepSpeed
ZeRO
Pipeline Parallelism

gptq

davila7

18.0K

Enables 4-bit quantization for LLMs, reducing memory usage by 4x and accelerating inference 3-4x on consumer GPUs with minimal accuracy loss.

GPTQ
4-bit
LLM

benchling-integration

davila7

18.0K

Integrates with the Benchling R&D platform to automate lab data management, including DNA/protein registries, inventory, and ELN entries, via its API.

Benchling
API
ELN

qiskit

davila7

18.0K

Comprehensive toolkit for building, optimizing, and executing quantum circuits, supporting algorithms, simulations, and hardware execution for scientific and AI applications.

Quantum Circuits
Quantum Algorithms
Quantum Machine Learning

nnsight-remote-interpretability

davila7

18.0K

Enables neural network interpretability experiments for large models (70B+) using nnsight with remote execution, eliminating local GPU requirements.

nnsight
Neural Network Interpretability
PyTorch

pysam

davila7

18.0K

Python toolkit for reading, writing, and processing genomic data formats including SAM/BAM, VCF, and FASTA/FASTQ in next-generation sequencing workflows.

SAM/BAM
VCF
NGS

training-llms-megatron

davila7

18.0K

Trains large language models (2B-462B parameters) using NVIDIA Megatron-Core with tensor, pipeline, and expert parallelism for maximum GPU efficiency.

Megatron-Core
Tensor Parallelism
Expert Parallelism

bioservices

davila7

18.0K

Provides a unified Python API for accessing and analyzing biological data across 40+ bioinformatics databases including UniProt, KEGG, and PubChem.

Bioinformatics
UniProt
KEGG

guidance

davila7

18.0K

Guidance controls LLM outputs using regex and grammars to guarantee valid JSON, XML, and code generation, enforcing structured formats and enabling multi-step workflows.

Regex
Constrained Generation
JSON

cellxgene-census

davila7

18.0K

Enables querying the CZ CELLxGENE Census for single-cell expression data with biological filtering and integration with scanpy/PyTorch for population-scale analysis.

cellxgene
scanpy
single-cell

brenda-database

davila7

18.0K

Accesses the BRENDA enzyme database via its SOAP API to retrieve kinetic parameters, reaction equations, and organism data for biochemical research and metabolic pathway analysis.

BRENDA
SOAP
Enzyme Kinetics

qdrant-vector-search

davila7

18.0K

High-performance vector similarity search engine for RAG and semantic search, enabling fast nearest neighbor and hybrid search in production systems.

Vector Search
RAG
Hybrid Search

serving-llms-vllm

davila7

18.0K

Serves LLMs with high throughput using vLLM's PagedAttention and continuous batching for production API deployment.

vLLM
PagedAttention
Quantization

weights-and-biases

davila7

18.0K

Tracks ML experiments with automatic logging, real-time visualization, hyperparameter sweeps, and model registry management in a collaborative MLOps platform.

MLOps
Hyperparameter Sweeps
Model Registry

autogpt-agents

davila7

18.0K

Platform for creating and deploying persistent autonomous AI agents that execute multi-step automation through visual workflows and continuous agent systems.

AutoGPT
Autonomous Agents
Multi-step Automation

metabolomics-workbench-database

davila7

18.0K

Programmatically accesses the NIH Metabolomics Workbench database to query metabolite data, MS/NMR spectra, and study metadata for biomarker discovery and metabolomics research.

Metabolomics
MS/NMR
m/z