Shrewd'm.com 
A merry & shrewd investing community
Stocks A to Z / Stocks B / Berkshire Hathaway (BRK.A)
Author: whafa
Number: of 19824 
Subject: Re: OT: Bill Ackman
Date: 04/11/26 8:17 AM
No. of Recommendations: 6
Wow, that's impressive!!!

For some reason (mostly because it was possible, I think, and I couldn't believe how easy it was), in mid-2010 I started systematically downloading everything from the first post on. They allowed anonymous HTTP requests, exposed the API right in the URL, used a very predictable integer primary key, and anyone who posted there with any regularity "grokked" how it worked.
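
For anyone curious what that kind of crawl looks like, here is a minimal sketch in Python. It is not the actual script: the URL pattern, the `post_url` and `crawl` names, and the one-file-per-post layout are all assumptions made up for illustration. It only shows the general idea of walking a predictable integer ID with plain anonymous GET requests.

```python
import urllib.request
from pathlib import Path

# Hypothetical URL pattern. The real board's URL format is not given here.
BASE_URL = "https://boards.example.com/post?id={post_id}"


def post_url(post_id: int) -> str:
    """Build the page URL for one post from its integer ID."""
    return BASE_URL.format(post_id=post_id)


def fetch_anonymous(url: str) -> bytes:
    """Plain unauthenticated GET: no cookies, no login."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read()


def crawl(first_id: int, last_id: int, out_dir: str, fetch=fetch_anonymous) -> int:
    """Save posts first_id..last_id as <id>.html files; return how many were saved."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = 0
    for post_id in range(first_id, last_id + 1):
        target = out / f"{post_id}.html"
        if target.exists():
            # Already downloaded, so a months-long job can be resumed.
            continue
        try:
            body = fetch(post_url(post_id))
        except OSError:
            # urllib's URLError/HTTPError and timeouts are OSError subclasses;
            # skip IDs that fail and keep going.
            continue
        target.write_bytes(body)
        saved += 1
    return saved
```

Passing the fetch function in as a parameter keeps the loop testable without touching the network. A real run would also want a delay between requests.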

There were all kinds of hacks to that system. Do you remember when they botched an upgrade and accidentally allowed font and color changes? They spent hours chasing down all the violating posts. I still have a reference somewhere to the original "Breakfast!" post, which hit BestOf in 24pt font (and consisted of one tiny pancake). They tried to delete that too, of course, but thanks to a bug nothing was ever *really* deleted, just hidden: the reply page, which contained an entire copy of the OP, loaded without checking whether the post had been deleted. That was a fine time for me.

Anyway, I grabbed everything to date in 2010 -- ~1TB in ~18 million .html files (it took months!) -- and it sat on an external hard drive for over a decade. These AI coding tools came out right as I was starting to worry that background radiation would scramble my precious files, giving me both the motivation and the means to do something with them. And using this non-trivial data processing project to build and validate the platform for my wife's research project was a perfect fit. Message board posts, genetic data, they're *basically* the same, right? :D

I do deeply regret not figuring out how to do authenticated requests in batch files. You *did* have to be logged in to post a reply, so the reply-page trick didn't work anonymously and I could not see any "deleted" posts. So I left all of those behind. I have a little side project where I search the Internet Archive for those deleted posts, hoping to catch a snapshot taken after a post went up but before it was deleted. Can't say for sure that I've recovered any that way, but I *have* found many posts there. There's just not enough time in the day to look at everything.
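
As a rough illustration of that kind of lookup (not the actual side project), the sketch below builds a query for the Internet Archive's Wayback availability endpoint and pulls the closest snapshot out of the JSON reply. The endpoint and response shape are as I understand the public API; the page URL and function names are hypothetical. Since the deletion time is unknown, the code can only ask for the snapshot nearest the posting date, and someone still has to check what the snapshot contains.

```python
import json
from urllib.parse import urlencode

# Wayback Machine availability endpoint (as publicly documented, to my knowledge).
WAYBACK_API = "https://archive.org/wayback/available"


def availability_query(page_url: str, timestamp: str) -> str:
    """Build a query for the snapshot closest to timestamp (YYYYMMDD...)."""
    return f"{WAYBACK_API}?{urlencode({'url': page_url, 'timestamp': timestamp})}"


def closest_snapshot(response_text: str):
    """Return (timestamp, snapshot_url) from the JSON reply, or None if no snapshot."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest["timestamp"], closest["url"]
```

Fetching the query URL works the same way as any other anonymous GET. The parsing is kept separate so it can be checked against a saved response.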

