Xmodel-VLM: A Simple Baseline for Multimodal Vision Language Model
llm
research paper
May 16, 2024
Santosh Sawant
SUTRA: Scalable Multilingual language model architecture
llm
research paper
Recent advancements in Large Language Models (LLMs) have predominantly focused on a limited set of data-rich languages, with training datasets being notably skewed towards…
May 15, 2024
Santosh Sawant
Linearizing Large Language Models
llm
research paper
Over the last few years, Transformers have displaced Recurrent Neural Networks (RNNs) in sequence modeling tasks, owing to their highly parallel training efficiency and…
May 14, 2024
Santosh Sawant
From Local to Global: A Graph RAG Approach to Query-Focused Summarization
llm
research paper
The use of retrieval-augmented generation (RAG) to retrieve relevant information from an external knowledge source enables large language models (LLMs) to answer questions…
May 13, 2024
Santosh Sawant
Is Flash Attention Stable?
llm
research paper
Given the size and complexity of workloads, training Large Language Models (LLMs) often takes months, across hundreds or thousands of GPUs. For example, LLaMA2’s…
May 10, 2024
Santosh Sawant
LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding
llm
research paper
Optimizing the operational cost and compute requirements of LLMs is one of the sought-after topics for researchers. Accelerated solutions deploy on mobile, edge devices or commodity…
May 9, 2024
Santosh Sawant
Better & Faster Large Language Models via Multi-token Prediction
llm
research paper
Large language models such as GPT and Llama are trained with a next-token prediction loss. However, despite the recent wave of impressive achievements in LLMs…
May 8, 2024
Santosh Sawant
NeMo-Aligner: Scalable Toolkit for Efficient Model Alignment
llm
research paper
Aligning Large Language Models (LLMs) with human values and preferences is essential for making them helpful and safe. However, building efficient tools to perform alignment…
May 7, 2024
Santosh Sawant
PROMETHEUS 2: An Open Source Language Model Specialized in Evaluating Other Language Models
llm
research paper
Model-based evaluation with proprietary LMs such as GPT-4 has emerged as a scalable solution for assessing LM-generated text. However, concerns related to transparency…
May 3, 2024
Santosh Sawant
Octopus v4: Graph of language models
llm
research paper
LLMs have been effective in a wide range of applications, yet the most sophisticated models are often proprietary (GPT-4, Gemini) and considerably costlier than open source…
May 2, 2024
Santosh Sawant
Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models
llm
research paper
Evaluating language models is a challenging task: not only is it difficult to find meaningful data to test the models, but evaluating the correctness of a generated response…
Apr 30, 2024
Santosh Sawant
Make Your LLM Fully Utilize the Context
llm
research paper
These days the training context windows of many contemporary LLMs have been expanded to tens of thousands of tokens, thereby enabling these models to process extensive…
Apr 29, 2024
Santosh Sawant
CodecLM: Aligning Language Models with Tailored Synthetic Data
llm
research paper
Recent progress in instruction-tuned LLMs highlights the critical role of high-quality data in enhancing LLMs’ instruction-following capabilities. However, acquiring such…
Apr 25, 2024
Santosh Sawant
LLM-R2 : A Large Language Model Enhanced Rule-based Rewrite System for Boosting Query Efficiency
llm
research paper
Recently, database query rewriting with LLMs has been one of the sought-after use cases. The aim of query rewrite is to output a new query equivalent to the original SQL query, while…
Apr 24, 2024
Santosh Sawant
TransformerFAM: Feedback attention is working memory
llm
research paper
While Transformers have revolutionized deep learning, their quadratic attention complexity hinders their ability to process infinitely long inputs. One of the widely used…
Apr 19, 2024
Santosh Sawant
RecurrentGemma: Moving Past Transformers for Efficient Open Language Models
llm
research paper
Google recently released RecurrentGemma, an open language model that uses Google’s novel Griffin architecture. Griffin combines linear RNN with local attention to…
Apr 18, 2024
Santosh Sawant
MEGALODON: Efficient LLM Pretraining and Inference with Unlimited Context Length
llm
research paper
The Transformer architecture is the backbone of any production LLM, but despite its remarkable capabilities, it faces challenges with quadratic computational complexity and…
Apr 17, 2024
Santosh Sawant
Trust Region Direct Preference Optimization (TR-DPO) : Learn Your Reference Model for Real Good Alignment
llm
research paper
Aligning large language models with human preferences (RLHF) has become increasingly important to ensure safety and overall usefulness of the model. Traditionally, the…
Apr 16, 2024
Santosh Sawant
RHO-1: Not All Tokens Are What You Need
llm
research paper
High-quality training datasets are crucial for boosting LLM performance. Various data filtering techniques such as heuristics and classifiers are being utilized to select such…
Apr 15, 2024
Santosh Sawant
Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences
llm
research paper
Fine-tuning LLMs using Reinforcement Learning from Human Feedback (RLHF) has always been a preferred way of making LLMs more useful by aligning them with human values or…
Apr 9, 2024
Santosh Sawant
Stream of Search (SoS): Learning to Search in Language
llm
research paper
Transformer-based auto-regressive models such as GPT have shown remarkable performance in generative tasks but struggle when it comes to complex decision-making and…
Apr 8, 2024
Santosh Sawant
ReFT: Representation Finetuning for Language Models
llm
research paper
Parameter-efficient finetuning (PEFT) methods have been instrumental in the rapid adoption of fine-tuned, domain-specific LLMs. PEFTs not only reduced memory usage and time…
Apr 5, 2024
Santosh Sawant
Mixture-of-Depths: Dynamically allocating compute in transformer-based language models
llm
research paper
The Transformer FLOPs equation, or FLOPs-per-token, is one of the key attributes in determining the computation budget for any Transformer-based LLM. Usually in language models…
Apr 4, 2024
Santosh Sawant
sDPO: Don’t Use Your Data All at Once
llm
research paper
As the development of large language models (LLMs) progresses, aligning them with human preferences has become increasingly important to ensure safety and usefulness of the…
Apr 3, 2024
Santosh Sawant
Gecko: Versatile Text Embeddings Distilled from Large Language Models
llm
research paper
Recent advancements in text embedding models have been instrumental for various downstream tasks including document retrieval, sentence similarity, classification, and…
Apr 2, 2024
Santosh Sawant
Jamba: A Hybrid Transformer-Mamba Language Model
llm
research paper
Finally, the first production-grade commercially available Mamba-based model delivering best-in-class quality and performance is here. Introducing Jamba, a novel…
Apr 1, 2024
Santosh Sawant
Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models
llm
research paper
LLM empowered multi-modality inputs are becoming an essential part of Vision Language Models (VLMs) such as LLaVA and Otter. However, despite these advancements, a…
Mar 28, 2024
Santosh Sawant
RigorLLM: Resilient Guardrails for large language models against undesired content
llm
research paper
Large language models (LLMs) have demonstrated impressive capabilities in NLG and different downstream tasks. However, the potential of LLMs to produce biased or harmful…
Mar 27, 2024
Santosh Sawant
DRAGIN: Dynamic Retrieval Augmented Generation based on the Real-time Information Needs of Large Language Models
llm
research paper
Traditional methods of RAG typically rely on single-round retrieval, using the LLM’s initial input to retrieve relevant information from external corpora. While this method…
Mar 26, 2024
Santosh Sawant
SiMBA: Simplified Mamba-based Architecture for Vision and Multivariate Time series
llm
research paper
Recently, Structured State Space models (SSMs) such as Mamba have been pitched as an alternative to Transformer-based models, especially when it comes to increasing efficiency and…
Mar 25, 2024
Santosh Sawant
Cobra: Extending Mamba to Multi-Modal Large Language Model for Efficient Inference
llm
research paper
Vision language models (VLMs) like GPT-4, LLaMA-Adapter, and LLaVA have been instrumental in augmenting LLMs with visual understanding capabilities. VLMs serve as…
Mar 22, 2024
Santosh Sawant
Evolutionary Optimization of Model Merging Recipes
llm
research paper
Model merging offers a novel approach to leverage the strengths of multiple pre-trained models. It allows us to combine task-specific models, each potentially fine-tuned for…
Mar 21, 2024
Santosh Sawant
mPLUG-DocOwl 1.5: Unified Structure Learning for OCR-free Document Understanding
llm
research paper
Structure information is critical for understanding the semantics of text-rich images, such as documents, tables, and charts. Most existing Multimodal Large Language…
Mar 20, 2024
Santosh Sawant
PERL: Parameter Efficient Reinforcement Learning from Human Feedback
llm
research paper
Reinforcement Learning from Human Feedback (RLHF) is one of the most popular methods to align Pretrained Large Language Models (LLMs) with human preferences. It involves…
Mar 19, 2024
Santosh Sawant
RAFT: Adapting Language Model to Domain Specific RAG
llm
research paper
Adapting LLMs to specialized domains, which is essential to many emerging applications, usually takes two paths: in-context learning through Retrieval-Augmented…
Mar 18, 2024
Santosh Sawant
USER-LLM: Efficient LLM Contextualization with User Embeddings
llm
research paper
Large language models (LLMs) have revolutionized the field of user modeling and personalization due to their ability to learn and adapt from massive amounts of textual data.…
Mar 15, 2024
Santosh Sawant
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Horizon Generation
llm
research paper
Factual correctness has been one of the growing concerns around LLMs’ reasoning capabilities. This issue becomes more significant when it comes to zero-shot CoT (Chain of…
Mar 14, 2024
Santosh Sawant
MoAI: Mixture of All Intelligence for Large Language and Vision Models
llm
research paper
Following the success of the instruction-tuned LLMs, several visual instruction tuning datasets have been meticulously curated to enhance zero-shot vision language (VL)…
Mar 13, 2024
Santosh Sawant
VideoMamba: State Space Model for Efficient Video Understanding
llm
research paper
Mastering spatiotemporal representation is one of the key areas in any video understanding task. However, there are usually two challenges associated with it: (1) the large…
Mar 12, 2024
Santosh Sawant
Mixture-of-LoRAs: An Efficient Multitask Tuning for Large Language Models
llm
research paper
Adapter-based fine-tuning methods, such as LoRA, are key to making large language models disruptive in various domain-specific applications. LoRA introduces a limited number…
Mar 11, 2024
Santosh Sawant
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
llm
research paper
Training Large Language Models (LLMs) is challenging due to memory constraints from weight and optimizer size. Low-rank adaptation (LoRA) addresses this by adding trainable…
Mar 8, 2024
Santosh Sawant
InfiMM-HD: A Leap Forward in High-Resolution Multimodal Understanding
llm
research paper
Mar 7, 2024
Santosh Sawant
Design2Code: How Far Are We From Automating Front-End Engineering?
llm
research paper
Recent releases of advanced multimodal LLMs such as GPT-4V and Gemini Pro have led to breakthroughs in visual understanding and code generation. This has opened up…
Mar 6, 2024
Santosh Sawant
DiffuseKronA: A Parameter Efficient Fine-tuning Method for Personalized Diffusion Model
llm
research paper
In recent years, text-to-image (T2I) generation models such as DreamBooth and BLIP-Diffusion have rapidly evolved, generating intricate and highly detailed images that often…
Mar 5, 2024
Santosh Sawant
VisionLLaMA : A Unified LLaMA Interface for Vision Tasks
llm
research paper
Large language models, especially the LLaMA family of models, have aroused great interest in the research community for multimodal applications, where many methods heavily…
Mar 4, 2024
Santosh Sawant
Beyond Language Models: Byte Models are Digital World Simulators
llm
research paper
Bytes are the foundation of all digital data, devices, and software, from computer processors to operating systems in everyday electronics. Therefore, training models for…
Mar 1, 2024
Santosh Sawant
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
llm
research paper
Large Language Models (LLMs) have demonstrated remarkable performance in a wide range of natural language processing tasks, but their increasing size has posed challenges…
Feb 29, 2024
Santosh Sawant
ChunkLlama : Training-Free Long-Context Scaling of Large Language Models
llm
research paper
The ability to comprehend and process long-context information is essential for large language models (LLMs) to cater to a wide range of applications effectively. Finetuning…
Feb 28, 2024
Santosh Sawant
MobiLlama: Towards Accurate and Lightweight Fully Transparent GPT
llm
research paper
MobiLlama is another Small Language Model (SLM) for resource-constrained devices. MobiLlama is an SLM design that initiates from a larger model and applies a careful…
Feb 27, 2024
Santosh Sawant
ChunkAttention: Efficient Self-Attention with Prefix-Aware KV Cache and Two-Phase Partition
llm
research paper
Self-attention, one of the critical components in LLMs, performs poorly during inference since it performs intensive memory operations on key/value tensors of context…
Feb 26, 2024
Santosh Sawant
TinyLLaVA: A Framework of Small-scale Large Multimodal Models
llm
research paper
Large language models (LLMs) with large model size can greatly improve task performance but demand expensive computational resources for training. To address this, the LLM…
Feb 23, 2024
Santosh Sawant
The FinBen: An Holistic Financial Benchmark for Large Language Models
llm
research paper
Recent studies have shown the great potential of advanced LLMs such as GPT-4 on financial text analysis and prediction tasks in the financial domain. While their potential…
Feb 22, 2024
Santosh Sawant
GRIT : Generative Representational Instruction Tuning
llm
research paper
All text-based language problems can be reduced to either generation or embedding. Creating a single general model that performs such a wide range of tasks has been a…
Feb 16, 2024
Santosh Sawant
Aespa: Towards Next-Level Post-Training Quantization of Hyper-Scale Transformers
llm
research paper
With the increasing complexity of generative AI models, post-training quantization (PTQ) has emerged as a promising solution for deploying hyper-scale models on edge devices…
Feb 15, 2024
Santosh Sawant
Graph Mamba: Towards Learning on Graphs with State Space Models
llm
research paper
Graph Transformers (GTs) have shown promising potential in graph representation learning. GTs, however, have quadratic computational cost, lack inductive biases on graph…
Feb 14, 2024
Santosh Sawant
Fiddler: CPU-GPU Orchestration for Fast Local Inference of MoE Models
llm
research paper
Large Language Models (LLMs) based on Mixture-of-Experts (MoE) architectures are showing remarkable performance on various tasks. By activating a subset of experts inside…
Feb 13, 2024
Santosh Sawant
PHATGOOSE: Learning to Route Among Specialized Experts for Zero-Shot Generalization
llm
research paper
The availability of Huggingface PEFT modules has made it cheap and easy to modularly adapt a given pre-trained model to a specific task or domain. In the meantime, extremely…
Feb 12, 2024
Santosh Sawant
Tag-LLM: Repurposing General-Purpose LLMs for Specialized Domains
llm
research paper
General-purpose LLMs like LLaMA and GPT-4 have demonstrated remarkable proficiency in understanding and generating natural language. However, their capabilities wane in…
Feb 9, 2024
Santosh Sawant
Hydragen: High-Throughput LLM Inference with Shared Prefixes
llm
research paper
Transformer-based large language models (LLMs) such as OpenAI GPT3.5 and GPT4 are now deployed to hundreds of millions of users. LLM inference in such scenarios commonly…
Feb 8, 2024
Santosh Sawant
MambaFormer: Can Mamba Learn How to Learn? A Comparative Study on In-Context Learning Tasks
llm
research paper
State-space models (SSMs), such as Mamba, have been proposed as alternatives to Transformer networks in language modeling, by incorporating gating, convolutions, and…
Feb 7, 2024
Santosh Sawant
BlackMamba: Mixture of Experts for State-Space Models
llm
research paper
State-space models (SSMs) have recently demonstrated competitive performance to transformers at large-scale language modeling benchmarks while achieving linear time and…
Feb 6, 2024
Santosh Sawant
Repeat After Me: Transformers are Better than State Space Models at Copying
llm
research paper
Feb 5, 2024
Santosh Sawant
Re3val: Reinforced and Reranked Generative Retrieval
llm
research paper
The primary objective of retrieval models is to enhance the accuracy of answers by selecting the most relevant documents retrieved for a given query, ensuring models have…
Feb 2, 2024
Santosh Sawant
FIND: INterface for Foundation models’ embeDDings
llm
research paper
Foundation models across the vision and language domains, such as GPT4, DALLE-3, SAM and LLaMA, have demonstrated significant advancements in addressing open-ended…
Feb 1, 2024
Santosh Sawant
BLIVA: A Simple Multimodal LLM for Better Handling of Text-Rich Visual Questions
llm
research paper
Vision Language Models (VLMs), such as OpenAI’s GPT-4, Flamingo, BLIP-2 and LLaVA, have demonstrated significant advancements in addressing open-ended visual…
Jan 31, 2024
Santosh Sawant
MoE-LLaVA: Mixture of Experts for Large Vision-Language Models
llm
research paper
For Large Vision-Language Models (LVLMs), scaling the model can effectively improve performance. However, expanding model parameters significantly increases the training and…
Jan 30, 2024
Santosh Sawant
EAGLE: Extrapolation Algorithm for Greater Language-model Efficiency
llm
research paper
Auto-regressive decoding has become the de facto standard for large language models (LLMs). This process generates output tokens one at a time, which makes the generation by…
Jan 29, 2024
Santosh Sawant
MambaByte: Token-free Selective State Space Model
llm
research paper
In December 2023, the paper “Mamba: Linear-Time Sequence Modeling with Selective State Spaces” was released, and with it came the whole discussion about Mamba (SSM) being a viable…
Jan 25, 2024
Santosh Sawant
Instruction-Tune Llama2 with TRL
hugging face
llm
model building
This blog post is an extended guide on instruction-tuning Llama 2 from Meta AI. The idea of the blog post is to focus on creating the instruction dataset, which we can then…
Jan 25, 2024
Santosh Sawant
Towards Conversational Diagnostic AI
llm
research paper
With the Med-PaLM series of LLMs, Google is one of the few companies that can claim expertise in building medical domain-specific LLMs. The latest addition has been AMIE…
Jan 24, 2024
Santosh Sawant
ChatQA: Building GPT-4 Level Conversational QA Models
llm
research paper
With all open source LLMs trying to outperform GPT-4, one may wonder which one has truly been successful in Conversational QA - one of the elementary use cases of LLMs.
Jan 23, 2024
Santosh Sawant
How to Fine-Tune LLMs with TRL
hugging face
llm
model building
Large Language Models or LLMs have seen a lot of progress in the last year. We went from no ChatGPT competitor to a whole zoo of LLMs, including Meta AI’s Llama 2, Mistrals …
Jan 23, 2024
Santosh Sawant
Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads
llm
research paper
Jan 22, 2024
Santosh Sawant
Merge Model using Mergekit
tools
llm
model building
Model merging is a technique that combines two or more LLMs into a single model. It’s a relatively new and experimental method to create new models for cheap (no GPU…
Jan 22, 2024
Santosh Sawant
Tuning Language Models by Proxy
llm
research paper
These days, the capabilities of large pretrained LLMs can be significantly enhanced for specific domains or tasks of interest using additional fine-tuning. However, tuning these…
Jan 19, 2024
Santosh Sawant
DeepSpeed-FastGen: High-throughput Text Generation for LLMs via MII and DeepSpeed-Inference
llm
research paper
Microsoft DeepSpeed recently launched the DeepSpeed-FastGen LLM serving framework, which offers up to 2.3x higher effective throughput compared to state-of-the-art systems like…
Jan 18, 2024
Santosh Sawant
Self-Evaluation Improves Selective Generation in Large Language Models
llm
research paper
The trustworthiness of LLM output is one of the important considerations for the safe deployment of LLMs in production. One straightforward way to assess it is by measuring…
Jan 17, 2024
Santosh Sawant
Self-RAG: Learning to Retrieve, Generate and Critique through Self-Reflections
llm
research paper
Self-RAG is a new framework to train an arbitrary LM to learn to retrieve, generate, and critique to enhance the factuality and quality of generations, without hurting the…
Jan 16, 2024
Santosh Sawant
Reciprocal Rank Fusion (RRF) with LambdaMART: Context Tuning for Retrieval Augmented Generation (RAG)
llm
research paper
RAG typically consists of three primary components: Tool Retrieval, Plan Generation, and Execution. Existing RAG methodologies rely heavily on semantic search for tool…
Jan 15, 2024
Santosh Sawant
Chain of Thought (CoT): The Impact of Reasoning Step Length on Large Language Models
llm
research paper
If you are doing prompt engineering for LLMs then you might have come across Chain of Thought (CoT) prompting, which is significant in improving the reasoning abilities of…
Jan 12, 2024
Santosh Sawant
Infinite-LLM: Efficient LLM Service for Long Context with DistAttention and Distributed KVCache
llm
research paper
Introducing DistAttention, a distributed attention algorithm, and DistKV-LLM, a distributed LLM serving system, to improve the performance and resource management of…
Jan 11, 2024
Santosh Sawant
Soaring from 4K to 400K: Extending LLM’s Context with Activation Beacon
llm
research paper
Activation Beacon is a plug-and-play module for large language models that allows them to process longer contexts with a limited context window, while preserving their…
Jan 10, 2024
Santosh Sawant
Improving Text Embeddings with Large Language Models using fine-tuned Mistral-7B LLM
llm
research paper
Check out a groundbreaking paper on improving text embeddings with large language models (LLMs) like GPT-4! The authors propose generating synthetic training data for text…
Jan 9, 2024
Santosh Sawant
DocLLM: A Layout-Aware Generative Language Model for Multimodal Document Understanding
llm
research paper
Introducing DocLLM, a groundbreaking generative language model that can understand visually rich documents without the need for expensive image encoders. DocLLM uses a…
Jan 8, 2024
Santosh Sawant
Self-Play Fine-Tuning (SPIN): Converts Weak Language Models to Strong Language Models
llm
research paper
Self-Play Fine-Tuning (SPIN) is a new fine-tuning method to improve large language models (LLMs) without needing additional human-annotated data.
Jan 5, 2024
Santosh Sawant
A Comprehensive Survey of Hallucination Mitigation Techniques in Large Language Models
llm
research paper
The paper provides a comprehensive taxonomy categorizing over 32 techniques for mitigating hallucinations in large language models (LLMs). It groups the techniques into…
Jan 4, 2024
Santosh Sawant
LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning
llm
research paper
With only four lines of code modification, the proposed method can effortlessly extend existing LLMs’ context window without any fine-tuning. This work elicits LLMs’…
Jan 3, 2024
Santosh Sawant
Mamba-Chat: A Chat LLM based on State Space Models
llm
research paper
Mamba-Chat is the first chat language model based on a state-space model architecture, not a transformer.
Jan 2, 2024
Santosh Sawant
KwaiAgents: Generalized Information-seeking Agent System with LLMs - 2 Open-source models fine tuned for agent systems! Better than GPT-3.5 turbo as an agent!
llm
research paper
Driven by curiosity, humans have continually sought to explore and understand the world around them, leading to the invention of various tools to satiate this…
Jan 1, 2024
Santosh Sawant