Example: LSTM classifier for astronomical names

The International Astronomical Union maintains guidelines for naming astronomical objects. Some are specific (e.g., comet names are prefixed with type, year of discovery, etc.), while others leave more room for creativity (e.g., minor planet names should be "not too similar to an existing name of a Minor Planet or natural Planetary satellite"). The confluence of historical legacies and a modern explosion of discoveries has led to a large assortment of names, ranging from proper names to catalog numbers. Can we train a neural network to learn these naming rules automatically?

Recurrent neural networks are particularly good at learning sequence information. Here, we demonstrate an LSTM (long short-term memory) recurrent neural network trained to classify astronomical objects by the sequence of characters in their names. The architecture consists of a character embedding layer, a single 64-dimensional LSTM layer, and a fully-connected dense layer with softmax activation that differentiates between 6 classes: star, galaxy, quasar, comet, asteroid, or planet.
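
The architecture described above could be sketched in Keras roughly as follows. This is an illustrative sketch and not the demo's actual training script: only the 64-unit LSTM and the 6-way softmax come from the description. The character encoding (`encode_name`), maximum name length, vocabulary size, embedding width, optimizer, and loss are all assumptions chosen for the example.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

CLASSES = ["star", "galaxy", "quasar", "comet", "asteroid", "planet"]
MAX_LEN = 30      # assumed maximum name length, in characters
VOCAB_SIZE = 128  # assumed vocabulary: ASCII code points, with 0 used as padding
EMBED_DIM = 32    # assumed embedding width (not stated in the description)


def encode_name(name, max_len=MAX_LEN):
    """Map a name to a fixed-length array of character indices.

    Hypothetical encoding: each ASCII character maps to its code point,
    anything outside the ASCII range shares index 1, and short names are
    right-padded with 0.
    """
    codes = [ord(c) if 0 < ord(c) < VOCAB_SIZE else 1 for c in name[:max_len]]
    return np.array(codes + [0] * (max_len - len(codes)), dtype="int32")


def build_model():
    """Embedding -> single 64-unit LSTM -> dense softmax over the 6 classes."""
    model = keras.Sequential([
        keras.Input(shape=(MAX_LEN,), dtype="int32"),
        layers.Embedding(input_dim=VOCAB_SIZE, output_dim=EMBED_DIM),
        layers.LSTM(64),
        layers.Dense(len(CLASSES), activation="softmax"),
    ])
    # Optimizer and loss are assumptions; labels are expected as integers 0..5.
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


if __name__ == "__main__":
    model = build_model()
    batch = encode_name("NGC 1300")[np.newaxis, :]   # shape (1, MAX_LEN)
    probs = model.predict(batch, verbose=0)[0]       # shape (6,)
    # The model is untrained here, so the predicted class is arbitrary.
    print(CLASSES[int(np.argmax(probs))], probs)
```

Because the softmax layer outputs one probability per class, the predicted label is simply the class with the highest probability for a given name.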

Training was performed on 14,215 examples taken from Wikipedia tables. The model achieved 99.2% accuracy on the held-out 10% test split. Offline, the architecture and weights of the trained Keras model are serialized into a JSON file, which is loaded here and run in real time entirely within the browser, classifying each streamed sample below as it arrives.
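
For readers unfamiliar with the evaluation setup, a held-out 10% split and the accuracy metric can be sketched in plain Python as below. The helper names (`train_test_split`, `accuracy`), the fixed random seed, and the toy data are assumptions for illustration; the original data pipeline is not shown in this description.

```python
import random


def train_test_split(examples, test_fraction=0.1, seed=0):
    """Shuffle the examples and hold out a fraction of them for testing."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


if __name__ == "__main__":
    # Toy (name, label) pairs standing in for the real dataset.
    data = [(f"object-{i}", "star") for i in range(100)]
    train, test = train_test_split(data)
    print(len(train), len(test))
```

Accuracy is reported on the held-out portion only, since the model never sees those names during training.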