3. Data, Science & AI
Found 9989 skills
grpo-rl-training
davila7
Provides expert guidance for fine-tuning language models using GRPO and RL with the TRL library, focusing on reasoning and task-specific training.
llava
davila7
Enables image understanding, visual question answering, and multi-turn image-based conversations using a CLIP vision encoder and a LLaMA language model.
tensorrt-llm
davila7
Accelerates LLM inference with NVIDIA TensorRT for high throughput, low latency, and quantization support on NVIDIA GPUs.
google-analytics
davila7
Analyzes Google Analytics data to provide insights on website performance, traffic patterns, and conversion rates, enabling data-driven optimization strategies.
speculative-decoding
davila7
Accelerates LLM inference with speculative decoding, Medusa, and lookahead techniques, achieving 1.5-3.6x speedup for real-time applications and constrained deployments.
nowait-reasoning-optimizer
davila7
Reduces chain-of-thought token usage by 27-51% in reasoning models (QwQ, DeepSeek-R1) while preserving accuracy via the NOWAIT technique.
axolotl
davila7
Provides expert guidance for fine-tuning LLMs using Axolotl with YAML configurations and LoRA/QLoRA techniques.
mlflow
davila7
Manages the end-to-end machine learning lifecycle, including experiment tracking, model versioning, and production deployment.
deepspeed
davila7
Expert guidance for distributed training with DeepSpeed, covering ZeRO optimization, pipeline parallelism, and mixed precision techniques.
gptq
davila7
Enables 4-bit quantization for LLMs, reducing memory usage by 4x and accelerating inference 3-4x on consumer GPUs with minimal accuracy loss.
benchling-integration
davila7
Integrates with the Benchling R&D platform to automate lab data management, including DNA/protein registries, inventory, and ELN entries, via its API.
qiskit
davila7
Comprehensive toolkit for building, optimizing, and executing quantum circuits, supporting algorithms, simulations, and hardware execution for scientific and AI applications.
nnsight-remote-interpretability
davila7
Enables neural network interpretability experiments for large models (70B+) using nnsight with remote execution, eliminating local GPU requirements.
pysam
davila7
Python toolkit for reading, writing, and processing genomic data formats including SAM/BAM, VCF, and FASTA/FASTQ in next-generation sequencing workflows.
training-llms-megatron
davila7
Trains large language models (2B-462B parameters) using NVIDIA Megatron-Core with tensor, pipeline, and expert parallelism for maximum GPU efficiency.
bioservices
davila7
Provides a unified Python API for accessing and analyzing biological data across 40+ bioinformatics databases including UniProt, KEGG, and PubChem.
guidance
davila7
Controls LLM outputs using regex and grammars to guarantee valid JSON, XML, and code generation, enforcing structured formats and enabling multi-step workflows.
cellxgene-census
davila7
Enables querying the CZ CELLxGENE Census for single-cell expression data, with biological filtering and integration with scanpy/PyTorch for population-scale analysis.
brenda-database
davila7
Accesses the BRENDA enzyme database via its SOAP API to retrieve kinetic parameters, reaction equations, and organism data for biochemical research and metabolic pathway analysis.
qdrant-vector-search
davila7
High-performance vector similarity search engine for RAG and semantic search, enabling fast nearest neighbor and hybrid search in production systems.
serving-llms-vllm
davila7
Serves LLMs with high throughput using vLLM's PagedAttention and continuous batching for production API deployment.
weights-and-biases
davila7
Tracks ML experiments with automatic logging, real-time visualization, hyperparameter sweeps, and model registry management in a collaborative MLOps platform.
autogpt-agents
davila7
Platform for creating and deploying persistent autonomous AI agents that execute multi-step automation through visual workflows and continuous agent systems.
metabolomics-workbench-database
davila7
Programmatically accesses the NIH Metabolomics Workbench database to query metabolite data, MS/NMR spectra, and study metadata for biomarker discovery and metabolomics research.