3. Data, Science & AI

24 skills

Found 9989 skills

Total Stars: 6.7M
Avg Stars: 667

learn-from-pr

dotnet

23.2K

Analyzes completed PRs with agent involvement to extract behavioral lessons, identify patterns, and generate actionable recommendations for improving agent skills and documentation.

Pull Request
AI Agent
Pattern Recognition

domain-ml

rustfs

20.1K

Enables development of machine learning and AI applications in Rust, supporting model training, inference, and deep learning with Rust libraries.

tch-rs
burn
candle

snowflake-semanticview

github

18.4K

Creates, alters, and validates Snowflake semantic views using Snowflake CLI, including DDL validation and setup guidance.

Snowflake
Semantic View
Snowflake CLI

model-merging

davila7

18.0K

Merges fine-tuned AI models using techniques like SLERP and Task Arithmetic to combine domain expertise without retraining, enhancing performance and enabling rapid experimentation.

mergekit
SLERP
Task Arithmetic

shap

davila7

18.0K

Provides SHAP-based model interpretability for explaining predictions, feature importance, and bias analysis across ML models.

SHAP
Model Interpretability
Feature Importance

statistical-analysis

davila7

18.0K

Comprehensive statistical analysis toolkit for academic research, including hypothesis testing, regression, Bayesian methods, and APA reporting.

Hypothesis Testing
Regression
Bayesian Statistics

pubchem-database

davila7

18.0K

Queries PubChem database for chemical compounds, supporting searches by name, CID, SMILES, and retrieving properties, bioactivity, and similarity data.

PubChem
SMILES
Bioactivity

biomni

davila7

18.0K

Autonomous AI framework for biomedical research, executing complex tasks in genomics, drug discovery, and clinical analysis using LLM reasoning and integrated databases.

Genomics
Drug Discovery
LLM

esm

davila7

18.0K

Comprehensive toolkit for protein language models (ESM3, ESM C) enabling sequence, structure, and function prediction, design, and engineering tasks.

ESM3
Protein Language Models
Inverse Folding

hqq-quantization

davila7

18.0K

Enables 4/3/2-bit quantization of LLMs without calibration data, accelerating deployment via vLLM and HuggingFace Transformers.

Half-Quadratic Quantization
LLM Quantization
vLLM

simpo-training

davila7

18.0K

Provides a reference-free, efficient alternative to DPO for LLM preference alignment, achieving better performance with simpler training.

SimPO
DPO
LLM Alignment

torchdrug

davila7

18.0K

PyTorch-based toolkit for biomedical graph machine learning, featuring GNNs for molecular property prediction, protein modeling, and drug discovery tasks.

PyTorch
GNNs
Drug Discovery

kegg-database

davila7

18.0K

Direct REST API access to KEGG database for academic research, enabling pathway analysis, gene mapping, and metabolic pathway exploration.

KEGG
REST API
Pathway Analysis

scikit-learn

davila7

18.0K

A Python library for machine learning tasks including classification, regression, clustering, and model evaluation.

scikit-learn
Classification
Clustering

hmdb-database

davila7

18.0K

Accesses Human Metabolome Database for searching metabolites, retrieving chemical properties, spectra, and pathways to support metabolomics research.

HMDB
Metabolomics
NMR

evaluating-code-models

davila7

18.0K

Evaluates code generation models across HumanEval, MBPP, and MultiPL-E benchmarks using pass@k metrics for quality assessment.

HumanEval
MBPP
pass@k

pinecone

davila7

18.0K

Provides a fully managed vector database for production AI applications, supporting RAG, semantic search, and scalable recommendation systems with low latency.

Vector Database
RAG
Semantic Search

sglang

davila7

18.0K

Accelerates LLM inference with RadixAttention prefix caching for structured JSON/regex outputs, constrained decoding, and agentic workflows.

RadixAttention
Structured Generation
Constrained Decoding

polars

davila7

18.0K

High-performance data manipulation library using Apache Arrow for efficient filtering, grouping, and I/O operations in data analysis workflows.

Apache Arrow
DataFrame
Lazy Evaluation

openrlhf-training

davila7

18.0K

High-performance RLHF framework for training large language models (7B-70B+) using Ray, vLLM, and ZeRO-3, supporting PPO, DPO, and distributed training.

Ray
vLLM
RLHF

awq-quantization

davila7

18.0K

Provides activation-aware weight quantization for 4-bit LLM compression, enabling 3x speedup with minimal accuracy loss on limited GPU memory.

AWQ
4-bit quantization
LLM compression

rwkv-architecture

davila7

18.0K

Provides an RNN-Transformer hybrid architecture for efficient AI inference with linear time complexity, infinite context, and production-ready large language models.

RWKV
RNN-Transformer
O(n) Inference

networkx

davila7

18.0K

Comprehensive Python toolkit for creating, analyzing, and visualizing complex networks and graphs with graph algorithms and community detection capabilities.

NetworkX
Graph Algorithms
Network Analysis

constitutional-ai

davila7

18.0K

Provides constitutional AI training for aligning language models with human values using self-critique and AI feedback to reduce harmful outputs without human labels.

Constitutional AI
RLAIF
Self-critique