ChaosHunter (CH) begins with a set of suggested inputs and functions and is known for its ability to find formulas that model a designated output. But is the result a prediction, a classification, or something else? The answer is all of the above. What ChaosHunter is actually doing depends on two things:
1. The optimization goal function you have selected, and
2. The output you have chosen
The following outline may help you understand.
1. Optimization Goal is selected under "Trading Strategies"
In this case CH is NOT looking for a formula that matches some output. It is looking for a formula that makes the most money when trading rules are applied to the values the formula produces. The "Output" that you choose should be a price time series, which CH will use to determine fill prices, i.e., the prices you get when you buy and sell. There are two trading strategy goals:
A. Buy/sell cutoff. Here the formula values produced are compared to thresholds to determine whether a buy or sell signal is generated. If the formula output is greater than or equal to some constant (determined by CH), a buy signal is generated. If the formula output is less than or equal to some constant (determined by CH), a sell signal is generated.
B. Buy/sell/true/false. Here it is expected that your formula operations are largely chosen from the Boolean and Relational categories. Buy signals are generated when the formula produces True (not zero). Selling takes place when the formula produces False (zero).
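The two signal rules above can be sketched in a few lines of Python. This is only an illustration of the logic as described, not ChaosHunter's own code. The function names, the cutoff values 0.5 and -0.5, and the sample formula values are hypothetical. The sketch also assumes that the buy cutoff lies above the sell cutoff and that no new signal is generated when the value falls between them; the description above does not specify either point.

```python
from typing import Optional


def cutoff_signal(value: float, buy_cutoff: float, sell_cutoff: float) -> Optional[str]:
    """Goal A (buy/sell cutoff): compare the formula value to two constants.

    In CH the constants are determined by the optimizer; here they are
    passed in by hand. Assumes buy_cutoff > sell_cutoff.
    """
    if value >= buy_cutoff:
        return "buy"
    if value <= sell_cutoff:
        return "sell"
    return None  # assumption: no new signal between the two cutoffs


def true_false_signal(value: float) -> str:
    """Goal B (buy/sell/true/false): nonzero means True (buy), zero means False (sell)."""
    return "buy" if value != 0 else "sell"


# Hypothetical formula values, one per bar.
formula_values = [0.8, 0.1, -0.6, 0.0]

cutoff_signals = [cutoff_signal(v, buy_cutoff=0.5, sell_cutoff=-0.5) for v in formula_values]
boolean_signals = [true_false_signal(v) for v in formula_values]

print(cutoff_signals)
print(boolean_signals)
```

Under goal B a formula built from Boolean and Relational operations normally produces only 0 or 1, so the nonzero test simply maps 1 to a buy and 0 to a sell.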
2. Optimization Goal is NOT selected under "Trading Strategies"
In this case CH IS looking for a formula that matches some output. Depending on what that output is, the process can be called "prediction", "classification", or just "discovery", as follows:
A. Prediction. Prediction is when the output you have selected is the future value of something. It could be tomorrow's rainfall, next month's sales, the time it will take a pill to dissolve, or the change in the Dow Jones Industrial Average in the next hour.
B. Discovery. Discovery is when the output you have chosen is not necessarily the future value of something, but you want to discover a formula based on some inputs. It could be discovery of a formula for the area of a circle based on radius, the flood level based on inches of rainfall, or the horsepower of an engine based on fuel, piston size, and other input factors.
C. Classification. The formula is doing classification when we seek to determine into which category some data falls. It could be considered either prediction or discovery. ChaosHunter can only determine if the data is in one of two classes, such as whether a product is good or bad, whether the market will rise or fall tomorrow, whether it will rain or not this week, whether the project should be undertaken or not, etc. To do classification, the sample data we load into CH should have outputs that are either positive or negative, positive denoting one class and negative denoting the other. So in the case of predicting whether the market will rise or fall tomorrow, the output would be the historical change in price the next day (measured in hindsight, of course). If the change in price is positive, the sample is in the class "rise"; if it is negative, the sample is in the class "fall".
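As a rough illustration of how such an output column could be prepared before loading the data into CH, the Python sketch below computes the next-day price change for each day and reads the class off its sign. The prices and the function names are made up for the example. Leaving a zero change unlabeled is an assumption, since the description above covers only positive and negative outputs.

```python
from typing import List, Optional


def next_day_changes(prices: List[float]) -> List[float]:
    """For each day except the last, the change in price on the following day."""
    return [prices[i + 1] - prices[i] for i in range(len(prices) - 1)]


def class_of(change: float) -> Optional[str]:
    """Positive output denotes one class ("rise"), negative the other ("fall")."""
    if change > 0:
        return "rise"
    if change < 0:
        return "fall"
    return None  # assumption: a zero change is left unlabeled


# Hypothetical daily closing prices.
prices = [100.0, 101.5, 101.0, 101.0, 103.0]

outputs = next_day_changes(prices)       # the signed output column
labels = [class_of(c) for c in outputs]  # the class each output denotes

print(outputs)
print(labels)
```

The last day has no output because its next-day change is not yet known, which is why the output column is one row shorter than the price series.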
