3. Data, Science & AI

24 skills

Found 9989 skills

Total Stars: 6.7M

Avg Stars: 667

grpo-rl-training

davila7

18.0K

Provides expert guidance for fine-tuning language models using GRPO and RL with the TRL library, focusing on reasoning and task-specific training.

GRPO

TRL

llava

davila7

18.0K

Enables image understanding, visual question answering, and multi-turn image-based conversations using a CLIP vision encoder and a LLaMA language model.

CLIP

LLaMA

Vision-Language

tensorrt-llm

davila7

18.0K

Accelerates LLM inference with NVIDIA TensorRT for high throughput, low latency, and quantization support on NVIDIA GPUs.

NVIDIA TensorRT

LLM Inference

Quantization

google-analytics

davila7

18.0K

Analyzes Google Analytics data to provide insights on website performance, traffic patterns, and conversion rates, enabling data-driven optimization strategies.

Google Analytics

Traffic Analysis

Conversion Rate

speculative-decoding

davila7

18.0K

Accelerates LLM inference with speculative decoding, Medusa, and lookahead techniques, achieving 1.5-3.6x speedup for real-time applications and constrained deployments.

Speculative Decoding

Medusa

Lookahead Decoding

nowait-reasoning-optimizer

davila7

18.0K

Reduces chain-of-thought token usage by 27-51% in reasoning models (QwQ, DeepSeek-R1) while preserving accuracy via the NOWAIT technique.

NOWAIT

Chain-of-Thought

Token Efficiency

axolotl

davila7

18.0K

Provides expert guidance for fine-tuning LLMs using Axolotl with YAML configurations and LoRA/QLoRA techniques.

Axolotl

LoRA

YAML

mlflow

davila7

18.0K

Manages the end-to-end machine learning lifecycle, including experiment tracking, model versioning, and production deployment.

MLflow

Experiment Tracking

Model Registry

deepspeed

davila7

18.0K

Provides expert guidance for distributed training with DeepSpeed, covering ZeRO optimization, pipeline parallelism, and mixed precision techniques.

DeepSpeed

ZeRO

Pipeline Parallelism

gptq

davila7

18.0K

Enables 4-bit quantization for LLMs, reducing memory usage by 4x and accelerating inference by 3-4x on consumer GPUs with minimal accuracy loss.

GPTQ

4-bit

LLM

benchling-integration

davila7

18.0K

Integrates with the Benchling R&D platform to automate lab data management, including DNA/protein registries, inventory, and ELN entries, via its API.

Benchling

API

ELN

qiskit

davila7

18.0K

Comprehensive toolkit for building, optimizing, and executing quantum circuits, supporting algorithms, simulations, and hardware execution for scientific and AI applications.

Quantum Circuits

Quantum Algorithms

Quantum Machine Learning

nnsight-remote-interpretability

davila7

18.0K

Enables neural network interpretability experiments for large models (70B+) using nnsight with remote execution, eliminating local GPU requirements.

nnsight

Neural Network Interpretability

PyTorch

pysam

davila7

18.0K

Python toolkit for reading, writing, and processing genomic data formats including SAM/BAM, VCF, and FASTA/FASTQ in next-generation sequencing workflows.

SAM/BAM

VCF

NGS

training-llms-megatron

davila7

18.0K

Trains large language models (2B-462B parameters) using NVIDIA Megatron-Core with tensor, pipeline, and expert parallelism for maximum GPU efficiency.

Megatron-Core

Tensor Parallelism

Expert Parallelism

bioservices

davila7

18.0K

Provides a unified Python API for accessing and analyzing biological data across 40+ bioinformatics databases including UniProt, KEGG, and PubChem.

Bioinformatics

UniProt

KEGG

guidance

davila7

18.0K

Controls LLM outputs using regex and grammars to guarantee valid JSON, XML, and code generation, enforcing structured formats and enabling multi-step workflows.

Regex

Constrained Generation

JSON

cellxgene-census

davila7

18.0K

Enables querying the CZ CELLxGENE Census for single-cell expression data with biological filtering and integration with scanpy/PyTorch for population-scale analysis.

cellxgene

scanpy

single-cell

brenda-database

davila7

18.0K

Accesses the BRENDA enzyme database via its SOAP API to retrieve kinetic parameters, reaction equations, and organism data for biochemical research and metabolic pathway analysis.

BRENDA

SOAP

Enzyme Kinetics

qdrant-vector-search

davila7

18.0K

High-performance vector similarity search engine for RAG and semantic search, enabling fast nearest neighbor and hybrid search in production systems.

Vector Search

RAG

Hybrid Search

serving-llms-vllm

davila7

18.0K

Serves LLMs with high throughput using vLLM's PagedAttention and continuous batching for production API deployment.

vLLM

PagedAttention

Quantization

weights-and-biases

davila7

18.0K

Tracks ML experiments with automatic logging, real-time visualization, hyperparameter sweeps, and model registry management in a collaborative MLOps platform.

MLOps

Hyperparameter Sweeps

Model Registry

autogpt-agents

davila7

18.0K

Platform for creating and deploying persistent autonomous AI agents that execute multi-step automation through visual workflows and continuous agent systems.

AutoGPT

Autonomous Agents

Multi-step Automation

metabolomics-workbench-database

davila7

18.0K

Programmatically accesses the NIH Metabolomics Workbench database to query metabolite data, MS/NMR spectra, and study metadata for biomarker discovery and metabolomics research.

Metabolomics

MS/NMR

m/z
