Natural language refers to the way humans communicate with each other. Natural Language Processing (NLP) is broadly defined as the automatic manipulation of natural language. The study of natural language processing has been around for more than 50 years, growing out of the field of linguistics with the rise of computers. Deep learning methods have been evaluated on a broad suite of NLP problems and have achieved great success on challenging and interesting tasks such as text classification, language modeling, speech recognition, caption generation, machine translation, document summarization, and question answering.

Text data requires more cleaning and preprocessing than…
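A minimal sketch of what such preprocessing can look like, using only the standard library (the exact steps always depend on the downstream task):

```python
import re
import string

def clean_text(text):
    """Lowercase, strip punctuation, and collapse whitespace."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    return re.sub(r"\s+", " ", text).strip()

tokens = clean_text("Hello, World!  NLP is fun.").split()
print(tokens)  # ['hello', 'world', 'nlp', 'is', 'fun']
```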

The convolutional layer in a convolutional neural network (CNN) systematically applies filters to an input and creates output feature maps. Configuring the related hyperparameters, such as the stride with which the filter moves across the input image to downsample the output feature map, can be challenging. The output feature maps are sensitive to the location of features in the input. Downsampling the feature maps can address this sensitivity: pooling layers downsample feature maps by summarizing the presence of features in patches of the feature map.
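The pooling operation described above can be sketched in plain Python. Here, 2×2 max pooling with stride 2 halves each dimension of the feature map by keeping only the strongest activation in each patch:

```python
def max_pool2d(fmap, size=2, stride=2):
    """Max pooling: summarize each size x size patch by its maximum."""
    rows = (len(fmap) - size) // stride + 1
    cols = (len(fmap[0]) - size) // stride + 1
    return [
        [
            max(
                fmap[r * stride + i][c * stride + j]
                for i in range(size)
                for j in range(size)
            )
            for c in range(cols)
        ]
        for r in range(rows)
    ]

feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 6, 1, 2],
    [3, 1, 4, 0],
]
print(max_pool2d(feature_map))  # [[4, 5], [6, 4]]
```

Note that each output value only records that a strong feature was present somewhere in its patch, which is exactly why pooling reduces sensitivity to the feature's precise location.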

The stride is how…

The convolutional neural network (CNN) is a specialized type of neural network model designed for working with two-dimensional image data. Since detailed images can have incredibly high dimensionality based on the number of pixels, CNNs provide convolution operations that analyze groups of pixels, which makes fitting neural networks to large images feasible. Due to these properties, CNNs are important and useful for image classification, object detection in images, neural style transfer, and more.
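The convolution operation at the heart of this can be sketched in plain Python. Here a 3×3 filter slides over a 4×4 image with stride 1 and no padding ("valid" convolution), so each output value summarizes one 3×3 group of pixels:

```python
def conv2d(image, kernel):
    """'Valid' 2D cross-correlation (the convolution used in CNNs)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

image = [
    [1, 2, 0, 1],
    [0, 1, 3, 1],
    [2, 1, 0, 2],
    [1, 0, 1, 1],
]
# A simple hand-crafted vertical-edge detector; in a CNN the
# kernel values are learned rather than fixed.
kernel = [
    [1, 0, -1],
    [1, 0, -1],
    [1, 0, -1],
]
print(conv2d(image, kernel))  # [[0, 0], [-1, -2]]
```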

To build a CNN, you start by initializing a sequential model and then go on adding layers. Besides simply adding additional dense layers or dropouts…
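The sequential pattern itself can be sketched independently of any framework: a sequential model is essentially an ordered list of layers, each applied to the previous layer's output. The toy class and "layers" below are illustrative only, not a real library API:

```python
class Sequential:
    """A toy sequential model: layers are callables applied in order."""

    def __init__(self):
        self.layers = []

    def add(self, layer):
        self.layers.append(layer)

    def __call__(self, x):
        for layer in self.layers:
            x = layer(x)
        return x

model = Sequential()
model.add(lambda x: x * 2)   # stand-in for a convolution/dense layer
model.add(lambda x: x + 1)   # stand-in for a bias/activation
print(model(3))  # 7
```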

Neural networks can be powerful for classification problems when you have unstructured data such as text or images. There is a type of neural network that works particularly well on image data: the Convolutional Neural Network (CNN). CNNs are a useful model for image recognition due to their ability to recognize visual patterns at varying scales. For image processing, densely connected networks can grow very large when images have high resolution, so CNNs are often preferred over densely connected networks. This blog gives an introduction to the mechanism of CNNs.

The essence of a CNN is the…

Compared to traditional vanilla RNNs (recurrent neural networks), there are two advanced types of recurrent units: the LSTM (long short-term memory) and the GRU (gated recurrent unit). In this blog, we will give an introduction to the mechanism, performance, and effectiveness of these two types of neural networks.

In standard RNNs, a sigmoid or hyperbolic tangent activation function is generally used. There are large regions of each function where the derivative is very close to 0, which means the weight updates are small and the RNN saturates. When the values of the gradients are extremely low or high, it is…
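A quick numerical illustration of why this matters over long sequences: the sigmoid's derivative is at most 0.25, so backpropagating through many time steps multiplies many small factors together and the gradient vanishes. This is a minimal sketch of that arithmetic, not an actual backpropagation implementation:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)  # maximum value is 0.25, reached at x = 0

# Product of derivatives across 20 time steps, even in the best case x = 0.
grad = 1.0
for _ in range(20):
    grad *= sigmoid_grad(0.0)
print(grad)  # 0.25 ** 20, roughly 9.1e-13
```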

Recurrent Neural Networks (RNNs) are a special type of neural network designed for sequence problems. RNNs add explicit handling of order between observations when approximating a mapping function from input variables to output variables, which makes neural networks capable of predicting time series. Traditional time series forecasting methods like ARIMA focus on univariate data with linear relationships. RNNs, however, add the capability to learn possibly noisy and nonlinear relationships, and provide direct support for multivariate and multi-step forecasting. This blog gives a brief introduction to the mechanism of RNNs.
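The recurrence itself can be sketched with a scalar hidden state: at each step, the new state mixes the current input with the previous state. The weight values below are arbitrary illustrative choices; real RNNs use learned weight matrices over vector states:

```python
import math

def rnn_step(x_t, h_prev, w_x=1.0, w_h=0.5, b=0.0):
    """One recurrent step: h_t = tanh(w_x * x_t + w_h * h_{t-1} + b)."""
    return math.tanh(w_x * x_t + w_h * h_prev + b)

h = 0.0                      # initial hidden state
for x_t in [1.0, 0.0, 0.0]:  # a tiny input sequence
    h = rnn_step(x_t, h)
    print(h)
```

Even after the input returns to zero, the hidden state keeps a decaying trace of the earlier input, which is exactly the "memory of order" the paragraph above describes.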

The trait of RNN is to evaluate sequences of…

There are many prediction problems that involve a time component, such as forecasting a yield each year, a price each day, or a rate each hour; the time component makes these problems more difficult to handle. This blog will introduce machine learning techniques to better analyze and predict time series.

A time series can be decomposed into four constituent components: level (the baseline value), trend (the linear behavior), seasonality (the periodic behavior), and noise. According to the number of observations recorded at each time step, a dataset can be classified as a univariate or a multivariate time series. …
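These components can be combined additively to build a synthetic series. The values below (level 10, trend 0.5 per step, sine seasonality with period 12) are illustrative, and the noise term is omitted for reproducibility:

```python
import math

def make_series(n, level=10.0, trend=0.5, amplitude=2.0, period=12):
    """Additive model: y_t = level + trend * t + seasonality(t) (+ noise)."""
    return [
        level + trend * t + amplitude * math.sin(2 * math.pi * t / period)
        for t in range(n)
    ]

series = make_series(24)
print(series[0])  # 10.0: level only, since trend and seasonality are 0 at t=0
print(series[3])  # 13.5: 10 + 0.5*3 + 2*sin(pi/2)
```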

Options are financial derivatives based on the value of underlying securities. They give the buyer the right to buy (call options) or sell (put options) the underlying asset at a pre-determined price within a specific timeframe. There are also two basic styles of options: American and European. American options can be exercised any time before the expiration date of the option, whereas European options can only be exercised on the expiration date. This blog digs into an option-pricing model to understand the valuation of European options.

The Black-Scholes model, or Black-Scholes-Merton model, is a mathematical model for pricing an options…
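The model's closed-form price for a European call can be sketched with only the standard library, using the error function to build the standard normal CDF. The parameter values in the example are illustrative:

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_scholes_call(S, K, T, r, sigma):
    """Black-Scholes price of a European call option.

    S: spot price, K: strike, T: time to expiry in years,
    r: risk-free rate, sigma: annualized volatility.
    """
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return S * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d2)

# At-the-money call: spot 100, strike 100, 1 year, 5% rate, 20% volatility.
print(round(black_scholes_call(100, 100, 1.0, 0.05, 0.2), 2))  # 10.45
```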

Monte Carlo simulation is a computerized mathematical technique that relies on repeated random sampling to obtain numerical results. It is used to model the probability of different outcomes in a process that is impractical or impossible to solve analytically. The modern version of this technique was first used to work on nuclear weapons projects. It is named after the Monte Carlo Casino in Monaco. The technique is used by professionals in a wide range of fields such as finance, insurance, manufacturing, engineering, transportation, and science. …
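The classic introductory example of the technique estimates π by random sampling: points are drawn uniformly in the unit square, and the fraction that lands inside the inscribed quarter circle approaches π/4 as the sample count grows:

```python
import random

def estimate_pi(n, seed=0):
    """Monte Carlo estimate of pi from n uniform samples in the unit square."""
    rng = random.Random(seed)  # seeded for reproducibility
    inside = sum(
        1
        for _ in range(n)
        if rng.random() ** 2 + rng.random() ** 2 <= 1.0
    )
    return 4.0 * inside / n

print(estimate_pi(100_000))  # close to 3.1416; accuracy improves with n
```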

Linear regression is an important predictive analytical tool in the data scientist’s toolbox. In this blog, we implement least squares to approximate solutions of over-determined systems of linear equations by minimizing the sum of the squares of the errors in the equations. This blog introduces how linear algebra can be used to solve regression problems in machine learning and predictive analytics.

A system of linear equations has no exact solution when the matrix has more rows than columns, which means there are more equations than unknowns; therefore, we cannot always drive the error down to zero. However, a least squares solution can be…
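A minimal sketch of a least squares fit for a straight line y = a·x + b, solving the 2×2 normal equations directly with no external libraries (the data points are illustrative and chosen to lie exactly on a line, so the residual happens to be zero):

```python
def fit_line(xs, ys):
    """Least squares fit of y = a*x + b via the normal equations."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return a, b

# Points that lie exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
print(fit_line(xs, ys))  # (2.0, 1.0)
```

With noisy data the recovered (a, b) would instead minimize the sum of squared errors rather than reproduce the points exactly.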

What happened couldn’t have happened any other way…