News round-up for December 4th, 2022
Enter the Keras community prize ($9,000 in prizes!) by submitting a notebook or code repository that leverages KerasCV’s StableDiffusion implementation.
Check out the FeatureSpace utility in tf-nightly: a one-stop shop for tabular data preprocessing.
Beta-test the Keras v3 saving format in tf-nightly: this is the model representation format that will soon become the standard in the Keras ecosystem.
Weekly idea: The loop of progress
You develop interesting ideas not by being clever, but by thinking about things for a very long time, with obsessive relentlessness — by following every trail of thought, opening door after door, and not stopping until you reach a conclusion. The best ideas rarely spring from a sudden "Eureka" moment or from an ambitious brainstorming session. They're born from years of iterative refinement.
Everywhere you look, you see the same pattern. The most effective scientists aren't necessarily the smartest ones. They're the ones who tweaked their theories further than others, in close contact with experimental reality. The people who win machine learning competitions on Kaggle aren't those who started out with the cleverest approach; they're the ones who iterated on their approach more times, guided by a reliable evaluation process. The best startups didn't have product-market fit on day one. They kept pivoting and adjusting at a fast pace before becoming an "overnight" success.
A version of this is even true for people: the more willing we are to reevaluate our views and admit mistakes, the faster we learn and the further we can improve.
Iterating on your ideas is a feedback loop, where the results of each cycle inform the next one. It features roughly three steps:
Step 1: Ideation. Come up with a new idea — or an improvement on an existing idea.
Step 2: Experiment. Run a test to evaluate the quality of your idea.
Step 3: Analysis. Ponder the data generated by your experiment, and update your idea.
The cycle then repeats, with new insights being used to inform the next round of experimentation. This loop of progress continues indefinitely, with each iteration building on the knowledge gained from the previous one. Your creations get better each time.
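As a rough illustration, the three steps can be written as a tiny search loop. This is only a sketch under stated assumptions: the "idea" is reduced to a single number, and `experiment` is a hypothetical scoring function (lower is better) invented for this example. Real ideas are of course not this simple.

```python
import random

def experiment(idea):
    """Hypothetical test of an idea: returns an error score, lower is better."""
    return (idea - 3.0) ** 2

def loop_of_progress(initial_idea, iterations, seed=0):
    rng = random.Random(seed)
    best_idea = initial_idea
    best_score = experiment(best_idea)
    history = [best_score]
    for _ in range(iterations):
        # Step 1: ideation. Propose a small variation on the current idea.
        candidate = best_idea + rng.uniform(-0.5, 0.5)
        # Step 2: experiment. Evaluate the quality of the candidate.
        score = experiment(candidate)
        # Step 3: analysis. Keep the candidate only if the data says it is better.
        if score < best_score:
            best_idea, best_score = candidate, score
        history.append(best_score)
    return history

history = loop_of_progress(initial_idea=0.0, iterations=200)
print("first score:", history[0], "last score:", history[-1])
```

Because each cycle only keeps changes that the experiment supports, the recorded score never gets worse, and more iterations give more chances to improve.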
To make this concrete, let's look at a few particular cases. Science, for instance.
The loop of progress in science
Science is all about coming up with accurate models of reality. Accurate in the sense that they can produce predictions about future outcomes that will be verified in practice.
The loop of progress in the context of science is pretty much the scientific method. Start by formulating a theory — this is your "idea". Then design an experiment to evaluate the accuracy of your idea — typically you're going to want to invalidate your idea, because that's the quickest, highest-bandwidth form of intellectual feedback you can get. Run your experiment and collect data. Then analyze the data, and move on to the next iteration of your theory.
The loop of progress in software development
Imagine you're writing some code, and you hit a bug. The loop of progress kicks in.
You start with an initial mental model of what your code does. Like all models, that model is wrong, which is why there's a bug. You come up with an experiment to get feedback on your mental model: perhaps you'll write a test, or you'll add a print() statement somewhere to check whether the state of your program at that point actually matches what your mental model expects. Then you run your experiment by literally running the code. You inspect the logs, and you update your mental model accordingly. No bug can withstand this loop; you just need to run through it enough times.
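Here is a minimal sketch of one pass through that loop. The moving-average functions and the check are hypothetical, made up for this example. The experiment states what the mental model expects, prints what the program actually produces, and reports whether the two agree.

```python
def buggy_moving_average(values, window):
    # Off-by-one: the range stops one slice too early, so the last average is lost.
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window)]

def fixed_moving_average(values, window):
    # Updated after the experiment: include the final window.
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]

def matches_mental_model(fn):
    """Experiment: compare the actual output with what the mental model expects."""
    expected = [1.5, 2.5, 3.5]  # four values with a window of 2 give three averages
    actual = fn([1, 2, 3, 4], 2)
    print(f"{fn.__name__}: expected {expected}, got {actual}")
    return actual == expected

matches_mental_model(buggy_moving_average)
matches_mental_model(fixed_moving_average)
```

The printed mismatch for the first function is the feedback: it shows which assumption (the loop bounds) was wrong, which is what the next iteration fixes.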
The loop of progress in product development
You start with a product idea. You come up with a minimum viable product designed to check your assumptions. You run it by a group of users. You collect their feedback and analyze what it means. Then back to the drawing board. The next attempt will be better.
Maximizing iteration speed
In order to get better ideas, create better products, publish better papers, you need to run more iterations through the loop of progress. And since you only have limited time available, that means you need to remove bottlenecks along the loop so you can move through it faster.
No matter what context you find yourself in — science, programming, product design, etc. — the fundamentals are the same. There's a template you can use.
Better experiment design: You want to come up with an experiment that will maximize how much you learn. The most effective way to do this is usually to try to prove yourself wrong. For most people, this advice is hard to follow, because they desperately want to prove themselves right. The problem is that experiments meant to validate your assumptions rather than invalidate them tend to teach you next to nothing. Seeking confirmation may be emotionally satisfying, but it's entirely ineffective.
Faster experiment setup and execution: This one is simple. Use tools and infrastructure that minimize the time it takes to go from experiment design to practical results. For instance, if you're developing a new machine learning architecture, you might want to use Keras (so you can implement your ideas in a few minutes) and train your model on a TPU (so you get your results faster).
Better feedback: You want to be able to record and inspect everything about your experiment that can inform your mental model of what just happened. The quality of your next idea critically depends on the information generated by your last experiment, so better data collection and better data visualization can make a big difference. To continue our machine learning example: make sure to record as much data as possible at each epoch of model training, and display that data visually via TensorBoard. Perhaps use a tool like Weights & Biases to compare results across model runs, to facilitate new insights.
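To make the "record everything at each epoch" advice concrete, here is a framework-agnostic sketch using only the Python standard library. `RunLogger` and `best_epoch` are hypothetical helpers written for this example, and the metric values are placeholders, not real training results. In a Keras workflow, the built-in `TensorBoard` and `CSVLogger` callbacks play a similar role.

```python
import csv
import io

class RunLogger:
    """Hypothetical helper that stores one row of metrics per epoch."""

    def __init__(self, run_name):
        self.run_name = run_name
        self.rows = []

    def log(self, epoch, **metrics):
        self.rows.append({"run": self.run_name, "epoch": epoch, **metrics})

    def to_csv(self):
        """Serialize all rows so they can be loaded into a plotting tool."""
        if not self.rows:
            return ""
        buffer = io.StringIO()
        writer = csv.DictWriter(buffer, fieldnames=list(self.rows[0]))
        writer.writeheader()
        writer.writerows(self.rows)
        return buffer.getvalue()

def best_epoch(logger, metric):
    """Return the logged row with the lowest value of `metric`."""
    return min(logger.rows, key=lambda row: row[metric])

logger = RunLogger("baseline")
# Placeholder numbers for illustration only, not real training results.
for epoch, (loss, val_loss) in enumerate([(0.9, 1.0), (0.6, 0.8), (0.5, 0.85)]):
    logger.log(epoch, loss=loss, val_loss=val_loss)

print(logger.to_csv())
print("best epoch by val_loss:", best_epoch(logger, "val_loss")["epoch"])
```

Keeping the run name in every row means logs from several runs can be concatenated and compared side by side, which is the kind of cross-run view the paragraph above recommends.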
Lessons
Optimizing the loop of progress for speed and learning bandwidth is mostly an art, but there is some science to it. Ask yourself: what form does the loop of progress take in my field? What are the bottlenecks I could remove?
Take a close look at what you do to get feedback on your ideas. That is to say, your experiment design practices. Are you asking the right questions? The hardest questions?
Then consider how you implement your experiments and how you run them. Are there tools you could use to speed things up? Can you leverage automation to run more experiments?
Finally, think about the kind of data you're collecting and how you're analyzing it. Could you collect more information? Are you displaying it as a visual panopticon where insights will pop out, or are you looking at it piecewise, as if through a tiny keyhole?
Remember: the faster you can run through the loop of progress and the more information you get from each iteration, the more refinement cycles your ideas will go through, and the better they will become.