Sometimes data is too complex to fit well with a simple model. However, we can make a model more expressive by adding a set of rules. Consider data of the following form. A single linear model describes it poorly, but three separate linear models can potentially describe it well.
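To make the idea concrete, here is a minimal sketch in Python with NumPy. The synthetic data, the breakpoints at 3 and 6, and the helper names are all assumptions made for illustration; they are not the data or the rules from the post. The "rules" here are simply conditions on `x` that decide which linear model applies.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data with three linear regimes (illustrative assumption).
x = np.linspace(0.0, 9.0, 300)
y_clean = np.where(
    x < 3.0,
    2.0 * x,
    np.where(x < 6.0, 6.0 - (x - 3.0), 3.0 + 3.0 * (x - 6.0)),
)
y = y_clean + rng.normal(scale=0.2, size=x.shape)


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    slope, intercept = np.polyfit(xs, ys, deg=1)
    return slope, intercept


# One global linear model.
slope, intercept = fit_line(x, y)
mse_single = np.mean((y - (slope * x + intercept)) ** 2)

# Rule-based model: hypothetical breakpoints assign each point to a region,
# and every region gets its own linear model.
breakpoints = [3.0, 6.0]
region = np.digitize(x, breakpoints)  # 0 if x < 3, 1 if 3 <= x < 6, 2 otherwise

y_pred = np.empty_like(y)
for r in range(len(breakpoints) + 1):
    mask = region == r
    s, b = fit_line(x[mask], y[mask])
    y_pred[mask] = s * x[mask] + b
mse_rules = np.mean((y - y_pred) ** 2)

print(f"single linear model MSE: {mse_single:.3f}")
print(f"three rule-based models MSE: {mse_rules:.3f}")
```

In this sketch the breakpoints are fixed by hand. In practice the rules would have to be chosen or learned from the data, which is the harder part of the problem.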
Clustering is one way to construct information granules. Specifically, the goal of clustering is to discover hidden structure in numerical data in the form of clusters, which serve as information granules. One of the most common algorithms, K-Means clustering, represents information granules as sets with strict membership boundaries. An alternative algorithm, Fuzzy C-Means clustering, describes information granules as fuzzy sets: a fuzzy set is defined by a membership function \(A(x)\) that assigns a numerical value from the interval [0,1] to every element \(x\).
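The difference between strict and fuzzy membership can be sketched in a few lines of Python. The example assumes one-dimensional data, two hand-picked cluster centers, and the commonly used fuzzifier `m = 2`; it evaluates memberships for fixed centers only and does not run the full iterative algorithms. The function names are hypothetical.

```python
import numpy as np


def hard_membership(points, centers):
    """K-Means style assignment: each point belongs to exactly one cluster."""
    dist = np.abs(points[:, None] - centers[None, :])
    labels = dist.argmin(axis=1)
    return np.eye(len(centers))[labels]


def fuzzy_membership(points, centers, m=2.0, eps=1e-12):
    """Fuzzy C-Means membership for fixed centers.

    u[i, j] = 1 / sum_k (d[i, j] / d[i, k]) ** (2 / (m - 1))
    """
    dist = np.abs(points[:, None] - centers[None, :]) + eps  # avoid division by zero
    ratio = (dist[:, :, None] / dist[:, None, :]) ** (2.0 / (m - 1.0))
    return 1.0 / ratio.sum(axis=2)


# Illustrative data and centers (assumptions, not taken from the post).
points = np.array([0.0, 1.0, 2.5, 4.0, 5.0])
centers = np.array([1.0, 4.0])

U_hard = hard_membership(points, centers)
U_fuzzy = fuzzy_membership(points, centers)

print(U_hard)
print(np.round(U_fuzzy, 3))
```

Each row of `U_fuzzy` sums to 1, and the point at 2.5, which lies halfway between the two centers, receives a membership of 0.5 in each cluster, whereas the hard assignment has to place it entirely in one of them.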
Consider the following problem. You are developing a service that provides content to users. This could be music, movies, news, fun facts, etc. You want to do the job as well as possible and always find the content that interests your users the most.
Let us define an HMM for assigning POS tags. Assume that we observe the words in a sentence. The emission of words is governed by a hidden Markov process that describes the transitions between POS tags. This HMM can be described with the following graph.
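The same structure can also be written as a factorization of the joint probability. The notation here is an assumption introduced for illustration: \(w_i\) is the \(i\)-th observed word, \(t_i\) is its hidden POS tag, and \(t_0\) is a special start tag.

\[
P(w_1, \dots, w_n, t_1, \dots, t_n) = \prod_{i=1}^{n} P(t_i \mid t_{i-1}) \, P(w_i \mid t_i)
\]

The first factor in each term is the transition probability between tags, and the second is the emission probability of a word given its tag.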
Let’s explore applying an RNN to a simple task: classifying gender from names. The data has the following format.