1. Log in to the Nexosis API.
2. Click Datasets to see a list of all datasets that have been uploaded, as well as the size of each dataset, the date it was first uploaded, and the date it was last modified.
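If you prefer to inspect the same dataset list programmatically, a request like the following can be built. This is a minimal sketch: the base URL (`https://ml.nexosis.com/v1`) and the `api-key` header name are assumptions drawn from the public Nexosis documentation, so verify them against your account before use.

```python
# Sketch of listing datasets via the Nexosis REST API (no network I/O here;
# this only constructs the request so you can pass it to your HTTP client).
NEXOSIS_BASE_URL = "https://ml.nexosis.com/v1"  # assumed base URL

def list_datasets_request(api_key):
    """Build the URL and headers for a GET /data call."""
    url = f"{NEXOSIS_BASE_URL}/data"
    headers = {"api-key": api_key}  # Nexosis authenticates with an api-key header
    return url, headers

url, headers = list_datasets_request("YOUR-API-KEY")
```

The response to this call lists each dataset along with its size and upload dates, matching what the Datasets page shows.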
3. Identify the dataset you wish to use to build a model and click
4. Enter a Session name of your choice and click
5. Select a Session Type and click
   - If you choose Classification, we will ask whether you wish to balance your test set.
   - If you choose Forecast, you must indicate the period over which you want to generate your forecast by entering a start and end date.
   - If you choose Impact Analysis, you must indicate the date range over which you want to measure the impact of an event by entering a start and end date.
   - If you choose Anomaly Detection, we will ask whether or not your dataset contains anomalies.
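The per-type options above can be sketched as the parameters each session request would carry. The parameter names below (`predictionDomain`, `startDate`, `endDate`, `balance`, `containsAnomalies`) are assumptions based on the Nexosis API's documented conventions; treat this as illustrative, not a definitive client.

```python
def session_params(session_type, **opts):
    """Return the extra parameters each session type requires.

    Parameter names are assumptions modeled on the Nexosis API docs;
    check the official API reference before sending real requests.
    """
    if session_type == "classification":
        # Classification asks whether to balance the test set.
        return {"predictionDomain": "classification",
                "balance": opts.get("balance", True)}
    if session_type == "forecast":
        # Forecast requires the date range to predict over.
        return {"startDate": opts["start_date"], "endDate": opts["end_date"]}
    if session_type == "impact":
        # Impact Analysis requires the date range of the event to measure.
        return {"eventName": opts.get("event_name", "event"),
                "startDate": opts["start_date"], "endDate": opts["end_date"]}
    if session_type == "anomaly":
        # Anomaly Detection asks whether the data already contains anomalies.
        return {"predictionDomain": "anomalies",
                "containsAnomalies": opts.get("contains_anomalies", True)}
    raise ValueError(f"unknown session type: {session_type}")
```

For example, `session_params("forecast", start_date="2018-01-01", end_date="2018-01-31")` yields the start/end dates a Forecast session needs, mirroring the form fields in the UI.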
6. On the Column Metadata tab, specify the column that should be the target of your session. Sessions generate results for the target column, and only ONE target column may be specified at a time. To specify a target, either click the Role drop-down for the column you wish to predict and select Target, or start typing in the text box and select the column name when it appears. Click Next.
You also have the option to change the data type and imputation. Check out our Guides to learn more about Column Metadata.
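Column metadata can also be expressed as a payload. The shape below (a `columns` map of column name to `dataType`/`role`/`imputation`) follows the Nexosis documentation as I understand it, and the column names (`date`, `sales`) are purely illustrative assumptions.

```python
# Minimal sketch of a column metadata payload: one timestamp column and
# one target column with an imputation strategy. Names are hypothetical.
columns_metadata = {
    "columns": {
        "date":  {"dataType": "date",    "role": "timestamp"},
        "sales": {"dataType": "numeric", "role": "target", "imputation": "zeroes"},
    }
}

def target_columns(metadata):
    """Return the names of columns whose role is "target"."""
    return [name for name, spec in metadata["columns"].items()
            if spec.get("role") == "target"]
```

A quick check with `target_columns(columns_metadata)` confirms the single-target rule: exactly one column should come back.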
7. Confirm the session setup and click
Once a session has started, you will receive updates in the status log showing Pending, Started, and Completed. In the background, the Nexosis API automatically runs a multi-stage elimination process to narrow hundreds of algorithms down to a handful applicable to your data. Techniques such as data categorization, smoothing, aggregation, and imputation are used to determine the type of data, remove outliers, assign an appropriate granularity, and fill in missing values. Using the processed data, several models are built to find the one that best fits your dataset. Your dataset is also split into an 80% training set and a 20% test set to cross-validate the results.
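The 80/20 split described above happens server-side, but the idea is simple enough to sketch locally. This is only an illustration of the split, not the Nexosis implementation:

```python
def train_test_split_80_20(rows):
    """Split rows into an 80% training set and a 20% test set,
    mirroring the split Nexosis applies before cross-validating results.
    (Nexosis performs this server-side; this is purely illustrative.)"""
    cut = int(len(rows) * 0.8)
    return rows[:cut], rows[cut:]

train, test = train_test_split_80_20(list(range(100)))
```

With 100 rows, the training set holds the first 80 rows and the test set the remaining 20; the test set is then used to validate the models built on the training portion.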