Posts

Retrieval Augmented Generation(RAG)

 We have covered core ideas such as NLP and LLMs. Now let's look at how to build with an LLM and handle the deployment side using RAG. Retrieval-Augmented Generation (RAG) is a technique in natural language processing that combines retrieval-based and generation-based approaches to create more accurate and informative responses. Let's break down each point: 1. RAG Architecture Retriever: Retrieves relevant documents or passages from a large corpus based on the input query. Reader/Generator: Uses the retrieved documents to generate a coherent and relevant response. Combiner: Integrates the retrieved information with the generated response. 2. Pipeline Behind RAG The RAG pipeline is the above components working together: Query Processing: The input query is encoded into a vector. Retrieval: The encoded query is used to retrieve relevant documents from a vector database. Document Encoding: The retrieved documents are encoded into vectors. Information Inte...
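The query-processing and retrieval steps named in this excerpt can be sketched in a few lines. This is only a toy illustration under stated assumptions: the `encode` function is a bag-of-words stand-in for a learned embedding model, the "vector database" is a plain Python list, and the names `encode`, `cosine`, `retrieve`, and `build_prompt` are hypothetical rather than taken from the post. The generator (LLM) call is not shown.

```python
import math
from collections import Counter

def encode(text):
    """Toy encoder: bag-of-words counts (assumed stand-in for an embedding model)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def retrieve(query, corpus, k=2):
    """Return the k documents most similar to the encoded query."""
    q = encode(query)
    ranked = sorted(corpus, key=lambda doc: cosine(q, encode(doc)), reverse=True)
    return ranked[:k]

def build_prompt(query, passages):
    """Combine retrieved passages with the query into a prompt for a generator."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

# Made-up corpus for illustration only.
corpus = [
    "RAG combines retrieval with text generation",
    "Convolutional networks are effective for images",
    "A vector database stores document embeddings for retrieval",
]
query = "how does retrieval work in RAG"
top = retrieve(query, corpus, k=2)
prompt = build_prompt(query, top)
```

In a real system the encoder would be a trained embedding model and the ranking would be done by the vector database, but the flow (encode the query, rank documents by similarity, pass the best ones to the generator) is the same.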

Large Language Models(LLMs)

  Language Model A language model is a type of artificial intelligence model designed to understand, generate, and manipulate human language. Language models are trained on large datasets of text to learn patterns in language use, including grammar, vocabulary, and context. Key functions include: Text Generation: Producing coherent and contextually relevant text. Text Completion: Predicting the next word or phrase in a given context. Machine Translation: Translating text from one language to another. Sentiment Analysis: Determining the sentiment expressed in a piece of text. Text Summarization: Creating concise summaries of longer texts. Language models can be based on various architectures, such as Recurrent Neural Networks (RNNs) and Transformer models. Large Language Model (LLM) A large language model (LLM) is a type of language model that is characterized by its large size, typically measured by the number of parameters (weights) it has. LLMs are trained on vast amounts ...

Transformers: Self-attention

 In the realm of neural networks, three primary types are commonly discussed: Artificial Neural Networks (ANNs): These are fully connected networks comprising input, hidden, and output layers. Each neuron in a layer is connected to every neuron in the subsequent layer, enabling complex pattern recognition through weighted connections. Convolutional Neural Networks (CNNs): CNNs incorporate convolutional layers with kernels (filters) that slide across the input data to detect features. These networks also perform pooling operations to reduce dimensionality and flatten the data before passing it to fully connected layers. CNNs are particularly effective for image and spatial data processing. Recurrent Neural Networks (RNNs): RNNs are designed to handle sequential data by maintaining a memory of previous inputs through their hidden states. Unlike feed-forward networks, RNNs can process input sequences of variable length, making them suitable for tasks involving time series data, nat...

Hypothesis Testing - Statistics

  1. What is Hypothesis Testing and when do we use it? Hypothesis testing is a part of statistical analysis in which we test assumptions made about a population parameter. It is generally used when we want to compare: a single group with an external standard, or two or more groups with each other. A Parameter is a number that describes the data from the population, whereas a Statistic is a number that describes the data from a sample. 2. Terminology used Null Hypothesis: The null hypothesis is a statement that there is no statistically significant difference between the populations. Alternative Hypothesis: The alternative hypothesis states that there is a significant difference between the population parameters. The difference could be in either direction (greater or smaller). It is the opposite of the null hypothesis. Note: H0 must always contain equality (=). Ha always contains a difference (≠, >, <). For example, if we...

Z Score - Normal Distribution

 We are going to discuss the Z score in depth. Before that, we need to understand what a normal distribution and a standard normal distribution are. What is a distribution? A distribution in statistics is a function that shows the possible values for a variable and how often they occur. The variable can be any measured quantity, such as the age, height, or weight of people. What is a Normal distribution? The normal distribution is a distribution that is symmetric about the mean (the mean is the average of all the observations). Most of the observations in a normal distribution cluster around the mean. What is a standard normal distribution? The standard normal distribution is a normal distribution with a mean of 0 and a standard deviation of 1. A Z score can be computed for any observation, but reading it against the standard normal distribution (for example, to look up probabilities) assumes the observations are approximately normally distributed. What is a Z score? A Z-score is a numerical measurement that describes a value’s relationship to the mea...
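As a small illustration of standardizing to mean 0 and standard deviation 1, the sketch below uses the usual definition z = (x - mean) / standard deviation. The excerpt is cut off before it gives a formula, so this definition, the helper name `z_scores`, and the height values are assumptions for illustration, not taken from the post.

```python
from statistics import NormalDist, mean, pstdev

def z_scores(values):
    """Standardize values: subtract the mean, divide by the population std dev."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    return [(x - mu) / sigma for x in values]

# Made-up heights in centimetres, for illustration only.
heights = [150, 160, 170, 180, 190]
zs = z_scores(heights)

# After standardizing, the scores have mean 0 and standard deviation 1.
zs_mean = mean(zs)
zs_std = pstdev(zs)

# If the data are roughly normal, the standard normal CDF gives the
# fraction of observations expected to fall below a given Z score.
share_below_tallest = NormalDist().cdf(zs[-1])
```

A value equal to the mean gets a Z score of 0, and the sign tells you whether a value lies below or above the mean.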

Machine Learning Cross Validation

 A machine learning use case typically moves through several pipeline stages: data collection, feature engineering, feature selection, model creation, and model deployment. Before model creation, we usually perform a train test split on the dataset. Suppose we have 1000 records: the training set might get 70% of the data and the test set 30%, or 80% and 20%, depending on the size of the dataset. The model uses the 70% only for training, and the remaining 30% is used to check its accuracy. Both portions are selected randomly, so the kind of data present in the test set may not be present in the training set, and the measured accuracy can go down as a result. For this reason, whenever we use a train test split we usually set a random state, which controls how the data points are randomly selected. When we take random state=0, it will shuffle our data and provide accu...
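To make the effect of the random state concrete, here is a minimal standard-library sketch of a seeded 70/30 split on 1000 records, followed by a k-fold split. The excerpt is cut off before it says which cross-validation variant the post uses, so k-fold is an assumption here, and the helper names `train_test_split_indices` and `k_fold_indices` are hypothetical. In practice, scikit-learn's `train_test_split` (with its `random_state` parameter) and `KFold` are the usual tools.

```python
import random

def train_test_split_indices(n, test_ratio=0.3, seed=0):
    """Shuffle row indices with a fixed seed, then cut off the test portion."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_test = int(round(n * test_ratio))
    return indices[n_test:], indices[:n_test]

def k_fold_indices(n, k=5, seed=0):
    """Yield (train, test) index lists so every row is tested exactly once."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

# 1000 records split 70/30, as in the example above.
train, test = train_test_split_indices(1000, test_ratio=0.3, seed=0)

# The same seed reproduces the same split; a different seed changes it.
train_again, test_again = train_test_split_indices(1000, test_ratio=0.3, seed=0)

# Five folds: each record lands in the test set of exactly one fold.
splits = list(k_fold_indices(1000, k=5, seed=0))
```

Because a single split gives an accuracy that depends on which rows happened to land in the test set, averaging the score over all folds gives a steadier estimate than any one random state.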