Machine Learning NYC Neighborhoods

13 Jan 2016

Check out the NYC Neighborhood Predictor!

[Image: Prediction Matrix]

It uses AWS Machine Learning to predict which neighborhood a string of text originates from.

Overview

The predictor is a simple Elixir and Phoenix web app that exposes an AWS Machine Learning real-time endpoint.
Source Code
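
The app itself is written in Elixir. As a language-neutral illustration, here is a minimal Python sketch of what a call to an AWS Machine Learning real-time endpoint can look like. It assumes a boto3 `machinelearning` client is passed in; the model ID and endpoint URL are placeholders you would supply, and `predict_neighborhood` is a hypothetical helper name, not something taken from the app's source.

```python
def predict_neighborhood(client, model_id, endpoint_url, text):
    """Ask a real-time endpoint which neighborhood `text` most likely came from.

    `client` is expected to behave like boto3.client("machinelearning").
    `model_id` and `endpoint_url` are placeholders for your own model's values.
    """
    response = client.predict(
        MLModelId=model_id,
        # Record keys must match the attribute names in the input schema.
        Record={"Text": text},
        PredictEndpoint=endpoint_url,
    )
    prediction = response["Prediction"]
    label = prediction["predictedLabel"]
    # For multiclass models, predictedScores maps each category to a score.
    score = prediction.get("predictedScores", {}).get(label)
    return label, score
```

With real credentials, the client would be created with `boto3.client("machinelearning")`. Passing the client in as an argument keeps the function easy to exercise with a stand-in object.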

Machine Learning What?

Using a dataset of ~1 GB of geo-tagged tweets, we create a CSV with two columns: text and neighborhood.
After training and evaluating a machine learning (ML) model with this data, we expose the model's real-time endpoint via this Elixir application.
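
The CSV-building step might look like the following sketch. It assumes each tweet has already been reduced to a (text, longitude, latitude) tuple and that neighborhoods are available as simple polygons. The coordinates, the `NEIGHBORHOODS` table, and the helper names are all made up for illustration, and the original pipeline may have assigned neighborhoods quite differently.

```python
import csv
import io

# Hypothetical neighborhood outlines as (longitude, latitude) vertices.
# Real boundaries would come from a proper shapefile or GeoJSON source.
NEIGHBORHOODS = {
    "Neighborhood A": [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)],
    "Neighborhood B": [(2.0, 0.0), (4.0, 0.0), (4.0, 2.0), (2.0, 2.0)],
}


def contains(polygon, x, y):
    """Ray-casting point-in-polygon test."""
    inside = False
    j = len(polygon) - 1
    for i in range(len(polygon)):
        xi, yi = polygon[i]
        xj, yj = polygon[j]
        # Count edges that a horizontal ray from (x, y) crosses.
        if (yi > y) != (yj > y) and x < (xj - xi) * (y - yi) / (yj - yi) + xi:
            inside = not inside
        j = i
    return inside


def neighborhood_for(lon, lat):
    """Return the first neighborhood whose polygon contains the point, or None."""
    for name, polygon in NEIGHBORHOODS.items():
        if contains(polygon, lon, lat):
            return name
    return None


def build_training_csv(tweets):
    """Turn (text, lon, lat) tuples into CSV text with Text and Neighborhood columns."""
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(["Text", "Neighborhood"])
    for text, lon, lat in tweets:
        name = neighborhood_for(lon, lat)
        if name is not None:  # drop tweets outside every known neighborhood
            writer.writerow([text, name])
    return out.getvalue()
```

The `csv` module handles quoting, so tweet text that contains commas or quotes still ends up in a single column.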

Takeaways

Molding the training data to create a better model is the real challenge here.
Does my data even contain statistical correlations, or is it just noise?
The answer is to iterate, iterate, and iterate again on the model and the evaluation data.
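
One rough way to approach the "is it just noise?" question is to compare the model's accuracy on held-out data with a baseline that always guesses the most common neighborhood. The sketch below is an assumption about how one might check this, not a description of what AWS Machine Learning reports. The function names are illustrative.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy of always predicting the most frequent label."""
    if not labels:
        raise ValueError("labels must not be empty")
    most_common_count = Counter(labels).most_common(1)[0][1]
    return most_common_count / len(labels)


def accuracy(actual, predicted):
    """Fraction of predictions that match the actual labels."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must be the same length")
    if not actual:
        raise ValueError("actual must not be empty")
    hits = sum(1 for a, p in zip(actual, predicted) if a == p)
    return hits / len(actual)


def beats_baseline(actual, predicted):
    """True when the model does better than the majority-class guess."""
    return accuracy(actual, predicted) > majority_baseline_accuracy(actual)
```

A model that cannot beat this baseline is a hint that the text carries little signal about the neighborhood, or that the training data needs more work.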

Input Schema

{
  "version": "1.0",
  "targetAttributeName": "Neighborhood",
  "dataFormat": "CSV",
  "dataFileContainsHeader": true,
  "attributes": [
    {
      "attributeName": "Text",
      "attributeType": "TEXT"
    },
    {
      "attributeName": "Neighborhood",
      "attributeType": "CATEGORICAL"
    }
  ],
  "excludedAttributeNames": []
}
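
Before uploading the data, it can help to confirm that the CSV lines up with the schema. The following sketch assumes the schema JSON and the CSV are available as strings, and that the header row is meant to mirror the schema's attribute names in the same order. `check_header` is a hypothetical helper, not part of any AWS tooling.

```python
import csv
import io
import json


def check_header(schema_json, csv_text):
    """Return a list of problems found when comparing a CSV header to a schema."""
    schema = json.loads(schema_json)
    expected = [attr["attributeName"] for attr in schema["attributes"]]
    problems = []

    # The target must be one of the declared attributes.
    target = schema.get("targetAttributeName")
    if target not in expected:
        problems.append(f"target attribute {target!r} is not listed in attributes")

    # Only compare the first row when the schema says it is a header.
    if schema.get("dataFileContainsHeader", False):
        header = next(csv.reader(io.StringIO(csv_text)), [])
        if header != expected:
            problems.append(
                f"header {header} does not match schema attributes {expected}"
            )
    return problems
```

An empty list means the header and the schema agree under those assumptions.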

[Images: Prediction Matrix, Neighborhood Categories]
