My Mathematical Regression

(blog.dahl.dev)

131 points | by aleda145 3 days ago

22 comments

hiAndrewQuinn 2 minutes ago
The trick to making these problems intuitive is to mentally rewrite them into a "how many permutations of this string are possible?" problem. Consider the 2 * 3 case.
```
    ...
    ...
```
One way you might get there is
```
    Right, right, right, down, down
```
Then you can rewrite this as
```
    RRRDD
```
You will always need 3 R's, and you will always need 2 D's. So how many unique strings can be made with this?
Well let's actually consider the degenerate cases.
```
    ABCDE
```
there are 5 places A can go, then 4 left B can go, then 3 left C can go, and so on, until we get 5! = 120 possible permutations of ABCDE. If you replace the B with another A to get
```
    AACDE
```
now there are only 60 permutations, because half of the original 120 only differed by where the A and the B were relative to one another. By that same logic,
```
    AACCE
```
has only 30 combinations, and
```
    AACCC
```
has only 10 (seeing why it's 10 and not 20 is actually the trickiest part imo, it's because there are 3! ways to arrange CDE, but only 1 to arrange CCC).
AACCC is isomorphic to RRDDD, which is how we get 10 possible paths to solve the 2*3 grid. We can check this with the binomial theorem: ((2+3) choose 3) = 10.
jp57 2 hours ago
I think one of the saddest thing is that the kind of person who would recognize, "we can solve this seemingly complicated problem by just applying this formula", would often have trouble even getting recognized in many corporate environments.
I managed a guy like that. He was capable of very complex thinking, but he wasn't in love with complexity, he was in love with simplicity. His solutions tended to be of the form, "we can ignore all these things, and just focus on X, and it will provide all the value." He'd notice something and simplify it and the benefit to the company would be measured in multiples of his salary.
Every manager who'd ever directly managed him knew what a treasure he was, but it was often hard for us to convince others of the value of his solutions because they were so simple, and people were convinced that hard problems must have complex solutions. (or else they would have solved them, right?)
He eventually got bored. He retired and joined a seminary.
[-]
- abhaynayar 41 minutes ago
  If his monetary value to the company was as said why would any other metric like complexity even remotely matter or need convincing assuming the main goal of the company was to make money.
  Money would matter even more than the interpersonal stuff in most cases but on top of it even the managers treasured him so there should've been even less of an issue of communicating value.
  Getting bored is totally understandable though given his calibre but that's a separate issue from how the company evaluates performance.
  [-]
  - skydhash 19 minutes ago
    > even the managers treasured him so there should've been even less of an issue of communicating value.
    I’m not a fan of doing politicking, bjut after much courses on writing and communication, I strongly believe that such simple solutions could have been presented in a way to justify rewards.
    There are people that do nothing worthwhile and can find words to justify themselves. If someone brings value, you can find words to earn him recognition.
  - crabbone 11 minutes ago
    Oh, I've worked in more than a dozen of software companies. When it comes to planning activities and setting goals, I've rarely seen a lot of sense.
    I mean, we are in the industry where it used to be a standard practice, not so long ago, to deliver daily reports about one's activities while "planking", or throwing a beach ball to another person doing some silly acrobatics...
    It should come as no surprise that there's no rigorous assessment protocol for these kinds of things anywhere. Retrospectively, I will admit, that enormous amount of effort and resources are wasted due to bad planning. But it's still not done.
    I can imagine that with the field becoming more competitive, eventually, the industry specialists will come together and try to address the problem, but so far and for so long the resources just kept flowing in, the huge waste wasn't really a problem.
- xg15 2 hours ago
  I imagine this is where the reputation of a good manager comes in and the ability to say to their boss "hey, we should keep this guy... just trust me on this."
- thelonelyborg 40 minutes ago
  great story.
mb7733 2 hours ago
An intuitive motivation for the solution in the article (2n choose n). For an n*n grid you have to you will take 2n steps, n "over" and n "down". All that matters is the order of the steps. So if you think of there being 2n "slots", you have to pick n to be "over", and the rest are forced to be "down". So it's n choose 2n indeed.
You can also think of it another way, without using the formula combinations, and only the fact that there are n! permutations of n objects. We can think of this a permutation of 2n items, made up of two groups of n identical items each. Using (2n!) will overcount, due to the fact that each of the "over" steps are identical, and similarly for the "down" group. We have cut down our answer by dividing out all of the repeated sequences. There will be n! redundancies for all the ways we can permute the "over" group and, the same for the "down" group. So this results in (2n!) / (n! * n!), which is exactly equal to 2n choose n. See [1] which explains permutations with repetion this in general. [Note: We pretty much re-derived the formula for combinations!]
[1] https://brilliant.org/wiki/permutations-with-repetition/
gobdovan 35 minutes ago
This is a pre-AI phenomena. I observe it quite a lot with stuff I did in high school but usually with complex problems. What's generally happening is that you were working with pen and paper through a hard problem. With adult brain, you'd expect just to know the answers, but in reality you're not much smarter than you were at 14, so you need to do the thing properly.
Also if you help little kids with homework, you'll see that some problems are quite difficult as well and require you to actually think, even if it's problems for 10 year olds.
vm64 8 minutes ago
We taught this problem in my college’s discrete math course. The intuition we gave is that it’s exactly equivalent to the number of ways to rearrange a string of 20 Rs and 20 Ds (corresponding to a right and down move)
dhosek 1 hour ago
There’s a bit of hand-waving in the jump to 2n choose n solution, which I suppose is fine, and my ex–math teacher brain really wants to have a proper proof or at least solid reasoning rather than “it follows the pattern” based on three observations.
But I am reminded of how during my engagement 24 years ago, my future father-in-law raised an issue of being able to determine whether they were getting the full amount of sandpaper on large rolls that they were paying for. I was able to simplify the question a bit to one that treated the rolls as if they were simple concentric rolls of a specified thickness and from there could turn it into the good old Gaussian sum formula times 2π to get the length. The engineers working for the company came up with the same solution, but instead of using n(n-1)/2 they did the summation with multiple rows in excel.
[-]
- gobdovan 38 minutes ago
  You can go down or right at any point. To go in bottom-right corner, you need n down steps and n right steps. In how many ways can you arrange n things on type A and n things of type B? In C(2n, n). The problem is about modeling, once you model it correctly, you get the definition of combinations.
- shenberg 47 minutes ago
  You always either go left or down so total 40 steps, choose 20 to be down (or 20 to be right)
d_silin 1 hour ago
Nobody forces you to use AI.
It has become sort of junk food for the brain. Temptations and ads for it everywhere.
[-]
- bluefirebrand 1 hour ago
  If you have performance based metrics about your AI usage then you are essentially being forced to use AI (or become unemployed)
  Plenty of people are experiencing this nowadays
  The idea that no one is being forced to use AI is nonsense
  [-]
  - pocksuppet 1 hour ago
    That was in the first calendar quarter this year. In the second calendar quarter CEOs started saying AI is too fucking expensive and let's stop doing it.
  - d_silin 54 minutes ago
    There is no metric a reasonably intelligent person can't sabotage or subvert...
    ...for example, you can write a script to burn tokens and write the code yourself.
    [-]
    - degamad 35 minutes ago
      e.g. <https://github.com/dtnewman/burn-baby-burn>
- lezojeda 1 hour ago
  [dead]
andredurao 51 minutes ago
That Project Euler problem was my first encounter with memoization. At the time, it felt like a magical solution, so I ended up solving it using the central column of Pascal's triangle, which was easier for me to understand back then.
I also tried a weird idea involving popcount, but it didn't scale. My approach was to represent each possible path with 0s (don't turn) and 1s (turn), testing the same number of 0s and 1s. However, even with popcount running in O(1) with hardware support, the total number of possible paths made the idea impractical :)
the_red_mist 2 hours ago
Tbh your student reasoning is still dangerous... the patterns could have not generalized nicely. see moser's circle problem
needed to justify viewing this as "arranging down vs right movements" as another comment outlines
kccqzy 50 minutes ago
I was sad in a different way. I immediately realized that this could be solved by dynamic programming by computing the recurrence F(x,y)=F(x-1,y)+F(x,y-1) with the base case F(0,0)=1 and F(x,y)=0 if x<0 or y<0. The problem is that I immediately jumped to generating functions as a tool to solve this. I defined G(u,v)=\sum_x \sum_y F(x,y) u^x v^y. After maybe ten minutes of manipulation I arrived at the closed form for G(u,v)=1/(1-u-v). At this point I recognized its series expansion and its coefficients are just given by the binomial theorem.
I feel sad because I had forgotten the simple and intuitive construction of choosing “go down” and “go right” directions. When a person learns more advanced mathematics, it is often the case that the person just applies such advanced mathematics by rote without realizing that a solution can be found with more elementary mathematics and more creativity. It reminded me of the time in middle school before derivatives were taught, when my teacher reminded me that using derivatives to solve a problem would receive no credit.
purple-leafy 2 hours ago
Heh, this grid image is all too familiar to me right now.
I’m building a grid based game and engine, and I have a game replay format which is not video.
I hit a massive wall with compression, trying to compress unit pathing and was trying to solve a similar solution.
Given an NxN grid, and the 4 cardinal directions (NSEW) you can move in, plus an extra action that makes you move 2 cells instead of 1, and considering you can move 4 cells per second…
What’s the smallest worst-case raw compression artefact you can output for 1 player for a 1 minute game?
It’s an extremely fun problem to solve. I tried:
- encoding changes into bits eg using 2 bits for direction
- movement pattern batching (ie batching 2 moves into 3 bits)
- crowd patterns and movement prediction
- treating movement as a “projectile” and deriving intermediate states
And all sorts of other wild crap that I will write up about on game launch
[-]
- tux3 2 hours ago
  What a lot of games do is run a strictly deterministic simulation in lockstep. Then you don't save the path of every unit, you save one move command for the whole group. Then the game replays inputs, and the pathing algorithm should give the same result if there are no desyncs.
  [-]
  - purple-leafy 2 hours ago
    Yes you are definitely onto something! Love to see more people talking about deterministic games.
    My game is strictly deterministic, so I get bot movement for free - but the player has agency so I need to capture their deviations
    That’s the tricky part! Right now I do capture input (actually just deviations) and can replay whole games, but I think I’m at the limits in terms of compression - talking bytes here not KB
floppyd 2 hours ago
Noticing a pattern and just extending it without proving why it works is not really a solution. You can prove it without really "understanding" it using induction, but that still would be proof, same as just counting on a computer.
utopiah 1 hour ago
Comment as a song https://mariedavidson.bandcamp.com/album/work-it-soulwax-rem...
There is no easy way out, you have to rest but you simply can't stop. Your body will rot, your mind too.
PS: song isn't an ode to the grind culture or how to slave away in an office, as lyrics say "you’ve got to work for yourself - Love yourself, feed yourself".
johndough 12 minutes ago
Another approach that often works for these kinds of problems and does not require much intelligence:
Work out the first few cases by hand (1,2,6,20 in our case) and then look up the sequence on "The On-Line Encyclopedia of Integer Sequences" (OEIS):
https://oeis.org/search?q=1%2C2%2C6%2C20&language=english&go...
Mainan_Tagonist 2 hours ago
The more i think about math these days, the more i see it as a muscle one must constantly train to achieve its potential.
Give it too long a rest and you have to go back at full blast for weeks on end to hope to ever achieve past performance.
I am very bad at math and have always been in awe of those who can do it well.
[-]
krackers 1 hour ago
"Compute the first few terms and plug into OEIS" is very high on the reward:effort scale
aesthesia 1 hour ago
I've had the same experience looking back at solutions to old problem sets and wondering how I ever came up with them.
cocoto 2 hours ago
Even if you don’t know or remember the basics of combinatorics you can solve the problem with basic dynamic programming : start with the unit grid and then expend it.
rabbitlord 1 hour ago
More like mathematical depression.
aeve890 2 hours ago
Ha. When I found that problem I draw the grids and paths from the example, left for a coffee and when came back I just look at the drawings at an angle and thought "well this is just Pascal's triangle". And the solution was obvious.
ogogmad 2 hours ago
```
  me@localhost:~> bc
  d=1; for(i=21; i < 41; i++){d *= i;}; print d; print "\n";
  335367096786357081410764800000
  n = 1; for(i = 1; i < 21; i++){n *= i;}; print n; print "\n";
  2432902008176640000
  d/n;
  137846528820
```
I couldn't start Python for some reason, so I went 1337 and used BC, which comes preinstalled in every Unix-like OS. BC has a surprising advantage here since 40!/20! cannot be represented as a 64-bit integer since its value exceeds 2^64. That said, BC's stdlib does not provide the factorial function* - so I had to resort to using for-loops instead.
* - What it does contain is sine, cosine, exponential, log, arctan, and Bessel J (?!?!?!?!)
[-]
- qsort 2 hours ago
  You don't need space for 40!/20!, for example:
```
  let ans = 1
  for (let i=1; i<21; ++i) {
    ans *= (41 - i)
    ans /= i
  }
```
  The same idea can be trivially tweaked to compute any binomial coefficient without ever storing an integer greater than the final result.
  [-]
  - ogogmad 2 hours ago
    Good point. But what if `i` does not divide `ans` evenly? I suppose you could use floats and then round.
    [-]
    - qsort 1 hour ago
      It always divides it evenly, that's why it works.
      After the i-th iteration of the for loop, ans will contain n!/((n-i)!i!) which is exactly \binom{n}{i}, an integer.
      Technically "ans" can grow above the final result in my example, but even that could be fixed if one really wants (e.g. i must divide either ans or n-i, you play a bit with divmod to figure out which division you do first.)
- aesthesia 1 hour ago
  Just noting that Python natively handles integers larger than the machine word size since version 2.5, so this would have worked in Python as well.
  [-]
  - Qem 7 minutes ago
    And does it quite fast too:
    https://www.wilfred.me.uk/blog/2014/10/20/the-fastest-bigint...
  - ogogmad 1 hour ago
    I think BC's features are equivalent to:
    - Python's native integer handling, which already has no size limit.
    - PLUS part of the Decimal module in Python's stdlib: BC's floats are DECIMAL by default, not binary.
    - PLUS an implementation of Bessel's J function, while neglecting Bessel's K.
    - Some features for base conversion using `ibase` and `obase`. So, I suppose you can output numbers to base 60. [EDIT: Correction from earlier: ibase is allowed to be at most 16, while POSIX allows for the maximum value of obase to be at least 99, which therefore does allow for formatting output to base 60.]
jasonmp85 2 hours ago
[dead]