Author: mungofitch
Subject: Re: ML for MI
Date: 06/28/2024 4:23 PM
No. of Recommendations: 9
"Well, I unintentionally stirred up a hornets' nest. The two most esteemed old-time members of the board, mungofitch and zeelotes, both seem to dismiss my ML attempt at stock prediction as complete data-mining folly."

I don't dismiss it.

Looking at the whole subject from a great distance, there are two things that using smarter tools might accomplish.
* Exploring unexamined "corners" of the parameter space and discovering pockets that tend to lead to good performance.
* Adding more degrees of freedom in the models.

The first one is a very good thing, provided only that the out-of-sample validation is done enough, and well enough, prior to trusting real money to it. Same with any investment method.
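To put bullet one in concrete terms, here's a minimal Python sketch: a hypothetical momentum-style screen on synthetic returns (nothing from any real backtest), sweeping a small parameter grid and then checking the winner only on a stretch of months the search never touched.

import itertools
import numpy as np

rng = np.random.default_rng(0)
months, stocks = 240, 50
returns = rng.normal(0.008, 0.06, size=(months, stocks))  # fake monthly returns

def screen_return(data, lookback, top_n):
    # each month, buy the top_n stocks by trailing lookback-month return
    monthly = []
    for t in range(lookback, data.shape[0]):
        trailing = data[t - lookback:t].sum(axis=0)
        picks = np.argsort(trailing)[-top_n:]
        monthly.append(data[t, picks].mean())
    return float(np.mean(monthly))

train, test = returns[:120], returns[120:]         # strict split in time
grid = itertools.product([3, 6, 12], [5, 10, 15])  # (lookback, top_n) candidates
best = max(grid, key=lambda p: screen_return(train, *p))
print("picked in-sample:   ", best, screen_return(train, *best))
print("out-of-sample check:", screen_return(test, *best))

On random data like this, the out-of-sample number should deflate the in-sample one; if a real screen survives that deflation, it has at least cleared the first hurdle.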

But the second one is generally a Very Bad Thing. There are good reasons that we have always liked screens with very few tuning parameters: less to tune means less chance of overtuning to the training set. It's astounding what a model will memorize if you don't watch it like a hawk.
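Here's a toy illustration of that memorization, on pure noise (entirely synthetic, just to show the shape of the problem): as the parameter count climbs, the in-sample fit improves steadily while the out-of-sample fit deteriorates.

import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(-1, 1, 30)
y_train = rng.normal(size=30)  # pure noise: there is nothing real to learn
y_test = rng.normal(size=30)   # an independent draw at the same points

for degree in (1, 5, 20):
    coeffs = np.polyfit(x, y_train, degree)
    fit = np.polyval(coeffs, x)
    print(f"degree {degree:2d}:  train MSE {np.mean((fit - y_train) ** 2):.2f}"
          f"  test MSE {np.mean((fit - y_test) ** 2):.2f}")

The degree-20 polynomial "explains" much of the training noise and pays for it on the test draw; a screen with twenty knobs does the same thing, just less visibly.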

I think there's a very good chance that you might accomplish good things on the first bullet, and I don't dismiss that at all. My prior post was merely intended to caution about the second...compounded by the hidden problems of doing out-of-sample validation cleanly.

An anecdote: back in the '90s, when my team was paid to do predictive modelling on very large data sets, we mostly used a whole lot of linear factors, and it worked. We souped it up by feeding that result into a neural net (not many layers back then), which polished the model for a few extra percentage points of accuracy. Since we understood the data set quite well, even that modest benefit was quite a surprise to us. These things can work. Fortunately we had a lot of validation data; we generally held back around 20% for that.
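For the curious, a rough sketch of that setup under one plausible reading: a small net trained on the residuals of a linear fit, with roughly 20% of the data held back for validation. The factors, data, and architecture below are all stand-ins, not the originals.

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(2)
X = rng.normal(size=(5000, 20))   # 20 made-up "factors"
beta = rng.normal(size=20) * 0.2  # mostly-linear ground truth
y = X @ beta + 0.3 * np.sin(3 * X[:, 0]) + rng.normal(0, 0.3, 5000)

# hold back roughly 20% for validation, as described above
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.2, random_state=0)

linear = LinearRegression().fit(X_tr, y_tr)

# a small net "polishes" whatever the linear factors missed
net = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
net.fit(X_tr, y_tr - linear.predict(X_tr))  # fit the residuals

print("linear only  R^2:", round(linear.score(X_val, y_val), 3))
print("linear + net R^2:",
      round(r2_score(y_val, linear.predict(X_val) + net.predict(X_val)), 3))

The linear model does most of the work; the net adds a few points of accuracy by picking up the mild nonlinearity, which is roughly the shape of the result described above.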

Data mining is defined as the discovery of new, useful, non-obvious and predictive rules in old data. Go for it.

Jim
