CRF Chunker for NLTK

The cornerstone of any natural language understanding system is its NLP algorithms. The classical set of algorithms constitutes an NLP processing pipeline: tokenization, part-of-speech tagging, and chunking (shallow parsing), as sketched below.
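As a rough illustration, here is what such a pipeline looks like with NLTK's off-the-shelf components. The regular-expression grammar is a hypothetical rule-based baseline for the chunking stage; a CRF chunker, presumably the subject of this post, would replace it.

```python
import nltk

# First-run setup: NLTK needs these models downloaded once.
# nltk.download("punkt")
# nltk.download("averaged_perceptron_tagger")

sentence = "The quick brown fox jumps over the lazy dog."

tokens = nltk.word_tokenize(sentence)   # 1. tokenization
tagged = nltk.pos_tag(tokens)           # 2. part-of-speech tagging

# 3. chunking: a simple rule-based NP chunker as a baseline
#    (a trained CRF chunker would slot in at this stage).
grammar = "NP: {<DT>?<JJ>*<NN.*>+}"
chunker = nltk.RegexpParser(grammar)
print(chunker.parse(tagged))            # prints a chunk tree
```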

Probase

Limitations of syntactic approaches:

Information Theoretic Approach to Classification

Logistic regression is the simplest form of classification. We all know that its cost function is the cross-entropy loss. But why?
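For reference, here is the standard binary formulation (the notation \(w\), \(\hat{y}_i\), and \(N\) is mine, not from the original text): the model predicts \(\hat{y}_i = \sigma(w^\top X_i)\), interpreted as \(P(y_i = 1 \mid X_i)\), and the cross-entropy loss over \(N\) samples is

\[
\mathcal{L}(w) = -\frac{1}{N} \sum_{i=1}^{N} \Bigl[\, y_i \log \hat{y}_i + (1 - y_i) \log (1 - \hat{y}_i) \,\Bigr].
\]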

Consider the task of classification, where you solve the problem of mapping a set of features \(X\) to a target label \(y\), so that \(C(X)=y\), where \(C\) is your classification function. Now, it is quite possible that your set of features does not provide a perfect explanation of the target class, and thus you may find several data samples that are identical but have different class labels, i.e. \(X_1=X_2\), \(C(X_1)=y_1\), \(C(X_2)=y_2\), with \(y_1 \neq y_2\). This is usually called “noisy data”. The process of training a model comes down to finding a function \(C\) that classifies correctly, but what is the measure of correctness, especially in the presence of noise?
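One way to make “correctness” precise (a sketch in my own notation, not the post's): let the classifier output a distribution \(Q(y \mid X)\) rather than a hard label, and measure correctness by the likelihood of the observed labels under \(Q\). Maximizing that likelihood is the same as minimizing the negative log-likelihood, and for a group of identical inputs \(X\) whose labels follow the empirical distribution \(P(y \mid X)\), the average negative log-likelihood is exactly the cross-entropy

\[
H(P, Q) = -\sum_{y} P(y \mid X) \, \log Q(y \mid X),
\]

which is minimized when \(Q\) matches \(P\). On this view, cross-entropy is the natural loss precisely because it handles identical inputs with conflicting labels gracefully.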


