In this video, I explain the mathematics behind the mixed model approach that we will use to analyze clustered data.

One such problem arises in biology, where observations come from repeated measurements or from clusters of related units. A Linear Mixed Effects Model (LMM) is a model with both fixed and random effects. It is used when the independence assumption of OLS regression is violated because observations taken from repeated measurements or within clusters depend on one another.
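As a sketch in standard textbook notation (the symbols below are assumptions, not taken from the video), an LMM is commonly written as:

```latex
\[
  y = X\beta + Zu + \varepsilon, \qquad
  u \sim \mathcal{N}(0, G), \qquad
  \varepsilon \sim \mathcal{N}(0, R)
\]
```

Here $y$ is the vector of observations, $X\beta$ carries the fixed effects, $Zu$ carries the random effects (for example, one random intercept per subject or cluster), and $\varepsilon$ is the residual error. The random term $Zu$ is what lets observations from the same cluster be correlated.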

Case study:

Here the Reaction variable is a repeated measure, taken 10 times for each subject. This is…


This article is a detailed base-model analysis of the Diabetes data, which includes the following steps:

  1. Data exploration (data distribution inferences, univariate data analysis, two-sample t-test)
  2. Data correlation analysis
  3. Feature selection (using logistic regression)
  4. Outlier detection (using principal component graph)
  5. Basic parameter tuning (CV, complexity parameter)
  6. Data modeling: Basic GLM (with all features, and after eliminating a few features based on AIC); Logistic Regression; Decision Tree; Naïve Bayes

Ref: https://rb.gy/xej8wd

Basic EDA

Data can be downloaded from https://www.kaggle.com/uciml/pima-indians-diabetes-database


Depending on the use case, our data can tell many stories, but drawing out the right meaning is different from simply running pre-existing libraries to get an accuracy score or some results.

I find that interpreting results is often confusing, especially the p-value, population, slope, or even a simple confidence interval: what exactly do these terms mean when we are solving a problem statement?

What are your thoughts on these interpretations?

To understand the concept of Linear Regression, this is a great post to start with: https://medium.com/analytics-vidhya/understanding-the-linear-regression-808c1f6941c0

Let's create a sample dataset.

This dataset contains ID, Gender (male = 1, female = 0), cholesterol, weight in kg, and height in cm. …
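A minimal sketch of what such a dataset could look like in plain Python. The column names follow the description above, but every value is made up for illustration and the variable names (`columns`, `rows`, `dataset`) are hypothetical, not the article's own.

```python
from statistics import mean

# Column names follow the description above; Gender is coded male = 1, female = 0.
columns = ["ID", "Gender", "Cholesterol", "Weight_kg", "Height_cm"]

# Hypothetical records: these numbers are invented purely for illustration.
rows = [
    (1, 1, 190.0, 82.0, 178.0),
    (2, 0, 175.0, 61.0, 162.0),
    (3, 1, 210.0, 95.0, 181.0),
    (4, 0, 160.0, 55.0, 158.0),
    (5, 1, 200.0, 88.0, 175.0),
    (6, 0, 185.0, 70.0, 167.0),
]

# One dictionary per subject, keyed by column name.
dataset = [dict(zip(columns, row)) for row in rows]

# A first summary: mean cholesterol for each gender code.
mean_cholesterol = {
    gender: mean(r["Cholesterol"] for r in dataset if r["Gender"] == gender)
    for gender in (0, 1)
}
print(mean_cholesterol)
```

Keeping the records as dictionaries makes each column easy to pull out by name before fitting a regression.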


Clustering based on basic criteria such as density, shape, and size is very common. DBSCAN is a widely used density-based clustering algorithm.

Reference: https://www.mdpi.com/2076-3417/9/20/4398/htm

For MapReduce, check this article (https://medium.com/@rrfd/your-first-map-reduce-using-hadoop-with-python-and-osx-ca3b6f3dfe78)

Algorithm description:

  1. Choose a random point p.
  2. Fetch all points that are density-reachable from p with respect to eps and minPts.
  3. A cluster is formed if p is a core point.
  4. If p is a border point, no points are density-reachable from p, so visit the next point of the dataset.
  5. Repeat the above process until all the points have been examined.
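The steps above can be sketched in plain Python as follows. This is a single-machine illustration, not the MapReduce version, and it makes a few assumptions: points are visited in index order rather than at random, distance is Euclidean, and the names `region_query`, `dbscan`, and `NOISE` are hypothetical helpers chosen for this example.

```python
from math import dist  # Euclidean distance, available in Python 3.8+

NOISE = -1  # label for points that belong to no cluster


def region_query(points, i, eps):
    """Return the indices of all points within eps of points[i] (including i)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    labels = [None] * len(points)  # None means "not examined yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            # Not a core point: mark as noise for now; it may later
            # turn out to be a border point of some cluster.
            labels[i] = NOISE
            continue
        # points[i] is a core point, so it starts a new cluster.
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: reachable, but not core
            if labels[j] is not None:
                continue  # already assigned to a cluster
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also core: keep expanding
    return labels


points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
print(dbscan(points, eps=1.5, min_pts=3))
```

With these sample points, the two tight squares form two clusters and the isolated point at (50, 50) is labeled as noise. The neighbor search here is a full scan, which is quadratic overall; real implementations usually rely on a spatial index.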

Here I haven't…


by Byasa Kabi Fakir Mohan Senapati

॥ पिता धर्मः पिता स्वर्गः पिता ही परमं तपः

पितरि प्रीतिमापन्ने प्रीयन्ते सर्वदेवताः॥

“Father is my dharma, Father is my heaven, he is the ultimate penance of my life. If he is happy, all Deities are pleased.”

This is a story of the ill-treatment of an old father by his educated son when the widowed father accompanies the brash son to his place of work. In contrast, you will read about the father’s unconditional compassion for his son.

Every parent dreams of educating their child, and in our story, Hari Singh also did everything to…


This is a collection of different ways to find the distance between points.

You won’t believe it, and it is funny too: I always cross the road diagonally. I wait for two signals to turn red.

The Pythagorean theorem for calculating the hypotenuse is close to my heart. Jokes apart, isn’t it true that the diagonal is the shortest path between two points? It might not always be the best way, though.

To calculate the distance AB between points A(x1, y1) and B(x2, y2), first draw a right triangle that has the segment AB as its hypotenuse, then calculate the length of that hypotenuse.
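A minimal sketch of that calculation in Python. The function names are hypothetical, and the Manhattan distance is added here only as a comparison (it corresponds to crossing the two roads one after the other instead of diagonally); it is not taken from the text above.

```python
import math


def euclidean_distance(a, b):
    """Length of the hypotenuse AB for points a = (x1, y1) and b = (x2, y2)."""
    dx = b[0] - a[0]  # horizontal leg of the right triangle
    dy = b[1] - a[1]  # vertical leg of the right triangle
    return math.hypot(dx, dy)  # sqrt(dx**2 + dy**2)


def manhattan_distance(a, b):
    """Sum of the two legs: the distance when you cannot cut diagonally."""
    return abs(b[0] - a[0]) + abs(b[1] - a[1])


A = (0, 0)
B = (3, 4)
print(euclidean_distance(A, B))  # 5.0
print(manhattan_distance(A, B))  # 7
```

For the 3-4-5 triangle above, the diagonal saves 2 units compared with walking along both legs.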


This article is an overview of the mathematics behind the Generalized Linear Model (GLM) and Principal Components. I will break the whole article into segments of questions that I generally ask myself before implementing any method.

The general form of the model:
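As a sketch in standard textbook notation (the symbols are assumptions and may differ from the ones used in the full article), a GLM is commonly written as:

```latex
\[
  g(\mu_i) = x_i^{\top}\beta, \qquad \mu_i = \mathbb{E}[y_i]
\]
```

Here $y_i$ follows a distribution from the exponential family, $x_i$ is the vector of predictors for observation $i$, $\beta$ is the coefficient vector, and $g$ is the link function. Choosing the identity link with a normal response recovers ordinary linear regression, and the logit link with a binomial response gives logistic regression.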

Geometrical view and projection matrices

Swayanshu Shanti Pragnya

M.S. in Data Science and Bio-medicine (DSB) | Independent Researcher | Philosopher | Art & Abstract Lover https://www.linkedin.com/in/swayanshu/
