Subject: Re: ML for MI
FC... First the GOOD news --- there are some intrinsic strengths in MLs

(1) Transformations - Not required. MLs figure out the topology themselves - i.e. you can literally throw in just Price and Volume and the algo will learn the rest for you. EXCEPT of course if your idea is to FIND which transforms matter - then you have to feed each transform in as a feature itself.

That being said - Price and Volume in RAW form, and especially in combination, produce a degree of uniqueness which, if fed in raw, the learner will memorize by heart and then not be able to generalize forward. So it's important to provide them in some standardized format (Stochastics, MA diff, anything you have found useful...)
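To make the "standardized formats" point concrete, here is a rough sketch of two such features built from a price series alone. The function names, window lengths and the closes-only simplification are my own assumptions for illustration, not a recommended recipe.

```python
from statistics import mean

def stochastic_k(prices, window=14):
    """Where each price sits in its trailing min-max range, scaled to 0..100.

    Simplification (assumption): uses closing prices only, so the range is
    the min/max of closes rather than of intraday highs and lows.
    """
    out = []
    for i in range(len(prices)):
        if i + 1 < window:
            out.append(None)  # not enough history yet
            continue
        win = prices[i + 1 - window : i + 1]
        lo, hi = min(win), max(win)
        out.append(50.0 if hi == lo else 100.0 * (prices[i] - lo) / (hi - lo))
    return out

def ma_diff(prices, fast=5, slow=20):
    """Fast minus slow moving average, as a fraction of the slow average."""
    out = []
    for i in range(len(prices)):
        if i + 1 < slow:
            out.append(None)  # not enough history yet
            continue
        f = mean(prices[i + 1 - fast : i + 1])
        s = mean(prices[i + 1 - slow : i + 1])
        out.append((f - s) / s)
    return out

prices = [10, 11, 12, 11, 13]
print(stochastic_k(prices, window=3))
print(ma_diff(prices, fast=2, slow=3))
```

Both features come out the same whether the instrument trades at 10 or at 100, which is the whole point: the learner sees the shape of the move, not a price level it can memorize.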

(2) Feature Importance, i.e. subset selection: Also not required. The algo figures it out for you - so the more you throw at it the merrier. Although in practice I have personally found that if you follow a final step of checking variable redundancy between Development and Holdout - it tends to generalize better Out-of-time, i.e. in Post-discovery simulation/testing

(3) However GOOD DATA is the key. In fact you are basically tethered/constrained by your data. But based on GTR1 capabilities - if someone knows how to manipulate it properly it should be possible to generate very good features for ML usage.

One of the simplest ideas could be an ML-SOS.

But again ML is not a panacea ..... and in fact if you consider and approach it as a "plebe" (no offense) - your chances of success are fairly limited. I know there's a LOT out there about AI democratization - but they didn't really put fresh graduates on the AlphaGo advisory team. Financial data is a notoriously tough problem - you can easily land on misleadingly rosy results in backtest which will essentially lose you money going forward.
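A toy illustration of how the rosy backtest happens, under loud assumptions: returns and trading rules here are both pure random noise (so nothing has any real edge), and the names, rule count and 70/30 chronological split are made up for the sketch.

```python
import random

def pnl(signal, returns):
    """Total return of a strategy that is long (+1) or short (-1) each period."""
    return sum(s * r for s, r in zip(signal, returns))

def best_of_random_rules(n_rules=200, n_periods=500, split=0.7, seed=0):
    """Pick the best of many random rules in-sample, then score it out-of-time.

    Assumption: returns and rules are both noise, so no rule has a true edge.
    """
    rng = random.Random(seed)
    returns = [rng.gauss(0.0, 0.01) for _ in range(n_periods)]
    cut = int(n_periods * split)  # chronological split, no shuffling
    rules = [[rng.choice((-1, 1)) for _ in range(n_periods)]
             for _ in range(n_rules)]
    in_sample = [pnl(rule[:cut], returns[:cut]) for rule in rules]
    best = max(range(n_rules), key=lambda k: in_sample[k])
    out_of_time = pnl(rules[best][cut:], returns[cut:])
    return in_sample[best], out_of_time

in_pnl, out_pnl = best_of_random_rules()
print(f"best rule in-sample: {in_pnl:+.3f}, same rule out-of-time: {out_pnl:+.3f}")
```

The in-sample number looks good simply because it is the maximum over many tries. Only the out-of-time number, on data the selection never touched, tells you anything about the forward.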

My personal view would be for this group

(a) Try to band together a group and work on it both independently and together to cross-check ideas etc.
(b) Use MLs as a primary search tool - i.e. to cull down the universe - you can simply eliminate a whole lot of stuff that doesn't work and then use the smaller set to build your own ideas
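A minimal sketch of the culling idea in (b). To keep it self-contained, a plain Pearson correlation stands in for whatever learner-based score you would actually use, and the names (cull, dev_end, min_abs_score) and the 0.1 threshold are hypothetical.

```python
def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return 0.0 if vx == 0 or vy == 0 else cov / (vx * vy) ** 0.5

def cull(features, target, dev_end, min_abs_score=0.1):
    """Keep features whose score clears the bar on BOTH development and holdout.

    features: dict of name -> list of values, aligned with target in time order.
    dev_end: index where development ends and holdout begins.
    """
    kept = []
    for name, values in features.items():
        dev = pearson(values[:dev_end], target[:dev_end])
        hold = pearson(values[dev_end:], target[dev_end:])
        # Same sign required, so a feature that flips direction is dropped.
        if min(abs(dev), abs(hold)) >= min_abs_score and dev * hold > 0:
            kept.append(name)
    return kept

target = [1, 2, 3, 4, 5, 6, 7, 8]
features = {
    "good": [2, 4, 6, 8, 10, 12, 14, 16],  # tracks target in both halves
    "flip": [1, 2, 3, 4, 4, 3, 2, 1],      # reverses direction in holdout
    "flat": [5] * 8,                       # carries no information
}
print(cull(features, target, dev_end=4))  # prints ['good']
```

Whatever survives this kind of filter is the smaller set you then take away and build your own ideas on.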

Best