<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://squarism.com/feed.xml" rel="self" type="application/atom+xml" /><link href="https://squarism.com/" rel="alternate" type="text/html" /><updated>2026-02-20T14:41:19-08:00</updated><id>https://squarism.com/feed.xml</id><title type="html">SQUARISM</title><subtitle>I&apos;m from the valley of the heavy heads.</subtitle><entry><title type="html">The Nvidia Sorter from GDC 2022</title><link href="https://squarism.com/2026/02/20/nvidia_sorter_gdc_2022/" rel="alternate" type="text/html" title="The Nvidia Sorter from GDC 2022" /><published>2026-02-20T00:00:00-08:00</published><updated>2026-02-20T00:00:00-08:00</updated><id>https://squarism.com/2026/02/20/nvidia_sorter_gdc_2022</id><content type="html" xml:base="https://squarism.com/2026/02/20/nvidia_sorter_gdc_2022/"><![CDATA[<p><img alt="Nvidia sorter screenshot" style="width: 100%; margin: auto;" src="/uploads/2026/sorter.jpg" /></p>

<p>I saw this animation from <a href="https://www.youtube.com/watch?v=Uo8rs5YfIYY&amp;t=405s">NVIDIA’s 2022 GTC presentation</a> and I thought it was neat.  So, I set out to re-create it.</p>

<p><a href="https://squarism.com/nvidia_sorter/">You can see it here</a>.</p>

<table>
  <thead>
    <tr>
      <th style="text-align: left">Keys</th>
      <th style="text-align: left"> </th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td style="text-align: left">?</td>
      <td style="text-align: left">this keys list</td>
    </tr>
    <tr>
      <td style="text-align: left">g</td>
      <td style="text-align: left">the GPU overlay that the presentation fades in and out</td>
    </tr>
    <tr>
      <td style="text-align: left">space</td>
      <td style="text-align: left">pause, at least the belt movement</td>
    </tr>
    <tr>
      <td style="text-align: left">d</td>
      <td style="text-align: left">debug: destination markers</td>
    </tr>
    <tr>
      <td style="text-align: left">w</td>
      <td style="text-align: left">debug: wireframe of “the belt”</td>
    </tr>
  </tbody>
</table>

<p><a href="https://github.com/squarism/nvidia_sorter">Source is here</a>.</p>]]></content><author><name></name></author><summary type="html"><![CDATA[I saw this animation from NVIDIA’s 2022 GTC presentation and I thought it was neat. So, I set out to re-create it. You can see it here. Keys   ? this keys list g the GPU overlay that the presentation fades in and out space pause, at least the belt movement d debug: destination markers w debug: wireframe of “the belt” Source is here.]]></summary></entry><entry><title type="html">A Network in Rust, Part 3</title><link href="https://squarism.com/2024/06/01/a_network_in_rust_part_3/" rel="alternate" type="text/html" title="A Network in Rust, Part 3" /><published>2024-06-01T00:00:00-07:00</published><updated>2024-06-01T00:00:00-07:00</updated><id>https://squarism.com/2024/06/01/a_network_in_rust_part_3</id><content type="html" xml:base="https://squarism.com/2024/06/01/a_network_in_rust_part_3/"><![CDATA[<p><img alt="Abstract Networking" style="width: 100%; margin: auto;" src="/uploads/2024/networking_abstract.jpg" /></p>

<p>We are in the home-stretch now.  Let’s show a few more details and then we’ll run the whole thing and show that it works.</p>

<p>This post is part 3 of a series.  We are learning networking by building a network.</p>

<ul>
  <li><a href="/2024/05/05/a_network_in_rust_part_1/">Part 1</a> - covered networking basics and implemented MAC addressing</li>
  <li><a href="/2024/05/17/a_network_in_rust_part_2/">Part 2</a> - implemented primitives like IP, ARP and an Interface</li>
</ul>

<h2 id="more-abstractions">More Abstractions</h2>

<p>In <a href="/2024/05/17/a_network_in_rust_part_2/">part 2</a>, we created an interface and previewed how we can make “a box” at the end of the post.  A box is just slang for a server, a node or a computer.  There are many ways we could represent a box, but one way would be for it to own the things that it owns in the real world.  Since we will not model an entire operating system here, this is just an approximation.</p>

<h3 id="the-server">The Server</h3>

<p>A box (or server) owns its hostname, has an interface and an ARP table.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">pub</span> <span class="k">struct</span> <span class="n">Server</span> <span class="p">{</span>
  <span class="k">pub</span> <span class="n">hostname</span><span class="p">:</span> <span class="nb">String</span><span class="p">,</span>
  <span class="k">pub</span> <span class="n">interface</span><span class="p">:</span> <span class="n">Interface</span><span class="p">,</span> <span class="c1">// for now, one interface</span>
  <span class="n">routes</span><span class="p">:</span> <span class="nb">Vec</span><span class="o">&lt;</span><span class="n">Route</span><span class="o">&gt;</span><span class="p">,</span>       <span class="c1">// routing table, not implemented</span>
  <span class="k">pub</span> <span class="n">arp_table</span><span class="p">:</span> <span class="nn">arp_cache</span><span class="p">::</span><span class="n">ArpCache</span><span class="p">,</span>
<span class="p">}</span>
</code></pre></div></div>

<p>It also owns its routing table as <code class="language-plaintext highlighter-rouge">routes</code>.  But the routes part of a server is not implemented here because we never made a router.  There is also a major gap in how an ICMP reply would be handled.  In the real world, there would be a network stack listening that would generate a response.  This is not implemented.</p>

<h3 id="the-switch">The Switch</h3>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">pub</span> <span class="k">struct</span> <span class="n">Switch</span> <span class="p">{</span>
  <span class="n">ports</span><span class="p">:</span> <span class="n">HashMap</span><span class="o">&lt;</span><span class="nb">u8</span><span class="p">,</span> <span class="nn">port</span><span class="p">::</span><span class="n">Port</span><span class="o">&gt;</span><span class="p">,</span>
  <span class="n">link_lights</span><span class="p">:</span> <span class="n">HashMap</span><span class="o">&lt;</span><span class="nb">u8</span><span class="p">,</span> <span class="nb">bool</span><span class="o">&gt;</span><span class="p">,</span>
  <span class="n">cam_table</span><span class="p">:</span> <span class="n">HashMap</span><span class="o">&lt;</span><span class="n">MacAddress</span><span class="p">,</span> <span class="nb">u8</span><span class="o">&gt;</span><span class="p">,</span> <span class="c1">// MAC to port number</span>
<span class="p">}</span>
</code></pre></div></div>

<!-- more -->

<p>The interesting part of <a href="https://github.com/squarism/layer_three/blob/main/src/switch/mod.rs">the Switch</a> is the <code class="language-plaintext highlighter-rouge">cam_table</code>.  This functions the same as the <code class="language-plaintext highlighter-rouge">arp_table</code> that the <code class="language-plaintext highlighter-rouge">Server</code> has, except it tracks what ports have what MAC addresses.  When a real switch is turned on, it doesn’t have any knowledge of who is physically in what port.  This is going to be faked by a <code class="language-plaintext highlighter-rouge">plug_in_interface</code> function which <em>knows this</em>.  On a real switch, this information would be learned by observing the source MAC of frames arriving on each port, and a frame for an unknown destination would be flooded out of all the other ports.</p>

<p>The switch contains ports.  This would be layer 1.  The cable and electrons are not modeled except for a message on <a href="https://github.com/squarism/layer_three/blob/main/src/switch/port.rs">the Port</a> which prints out a <code class="language-plaintext highlighter-rouge">"Sending frame out on port with MAC: {:02X?}"</code> line.  The simulation stops here.  To continue the simulation, we could model the cable or connection to the server and send the packet along with whatever is connected to this port.  But it doesn’t do that.</p>
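
<p>For comparison, here is a minimal sketch of how a learning switch could fill its CAM table by itself, rather than being told by <code class="language-plaintext highlighter-rouge">plug_in_interface</code>.  This is not code from the repo: <code class="language-plaintext highlighter-rouge">LearningSwitch</code>, <code class="language-plaintext highlighter-rouge">Forward</code> and the <code class="language-plaintext highlighter-rouge">Mac</code> alias are hypothetical names made up for illustration, and it assumes ports are numbered starting at 1.</p>

```rust
use std::collections::HashMap;

// Hypothetical stand-in for the repo's MacAddress type.
type Mac = [u8; 6];

#[derive(Debug, PartialEq)]
pub enum Forward {
    Port(u8),       // destination is known: send out exactly one port
    Flood(Vec<u8>), // destination is unknown: send out every other port
}

pub struct LearningSwitch {
    port_count: u8,
    cam_table: HashMap<Mac, u8>, // MAC to port number
}

impl LearningSwitch {
    pub fn new(port_count: u8) -> Self {
        Self { port_count, cam_table: HashMap::new() }
    }

    /// Learn the sender's port from the source MAC, then decide where the frame goes.
    pub fn receive(&mut self, in_port: u8, src: Mac, dest: Mac) -> Forward {
        self.cam_table.insert(src, in_port);
        match self.cam_table.get(&dest) {
            Some(&port) => Forward::Port(port),
            None => Forward::Flood(
                (1..=self.port_count).filter(|&p| p != in_port).collect(),
            ),
        }
    }
}
```

<p>In this sketch, the first frame from box1 to box2 would be flooded because the switch has not seen box2 yet.  Once box2 answers, both MACs are in the table and later frames go out a single port.</p>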

<h2 id="the-main">The Main</h2>

<p>Time to run the simulation.  Let’s walk through <a href="https://github.com/squarism/layer_three/blob/main/src/main.rs">the main</a>.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">fn</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
    <span class="c1">// Let's say that a server `box1` pings `box2`.</span>

    <span class="c1">// We need to set up our network "by hand"</span>
    <span class="k">let</span> <span class="k">mut</span> <span class="n">switch</span> <span class="o">=</span> <span class="nn">switch</span><span class="p">::</span><span class="nn">Switch</span><span class="p">::</span><span class="nf">new</span><span class="p">();</span>

    <span class="k">let</span> <span class="n">box1_interface</span> <span class="o">=</span> <span class="nn">server</span><span class="p">::</span><span class="nn">interface</span><span class="p">::</span><span class="nn">Interface</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span>
        <span class="s">"11:12:13:14:15:16"</span><span class="p">,</span>
        <span class="s">"192.168.0.1"</span><span class="nf">.to_owned</span><span class="p">(),</span>
        <span class="s">"255.255.0.0"</span><span class="nf">.to_owned</span><span class="p">(),</span>
        <span class="nb">None</span><span class="p">,</span>
    <span class="p">);</span>
    <span class="k">let</span> <span class="k">mut</span> <span class="n">box1</span> <span class="o">=</span> <span class="nn">Server</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span><span class="s">"box1"</span><span class="nf">.to_owned</span><span class="p">(),</span> <span class="n">box1_interface</span><span class="nf">.clone</span><span class="p">());</span>

    <span class="n">switch</span><span class="nf">.plug_in_interface</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">box1</span><span class="py">.interface</span><span class="p">);</span>

    <span class="k">let</span> <span class="n">box2_interface</span> <span class="o">=</span> <span class="nn">server</span><span class="p">::</span><span class="nn">interface</span><span class="p">::</span><span class="nn">Interface</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span>
        <span class="s">"21:22:23:24:25:26"</span><span class="p">,</span>
        <span class="s">"192.168.0.2"</span><span class="nf">.to_owned</span><span class="p">(),</span>
        <span class="s">"255.255.0.0"</span><span class="nf">.to_owned</span><span class="p">(),</span>
        <span class="nb">None</span><span class="p">,</span>
    <span class="p">);</span>
    <span class="k">let</span> <span class="n">box2</span> <span class="o">=</span> <span class="nn">Server</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span><span class="s">"box2"</span><span class="nf">.to_owned</span><span class="p">(),</span> <span class="n">box2_interface</span><span class="nf">.clone</span><span class="p">());</span>

    <span class="n">switch</span><span class="nf">.plug_in_interface</span><span class="p">(</span><span class="mi">2</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">box2</span><span class="py">.interface</span><span class="p">);</span>
</code></pre></div></div>

<p>This part above basically sets up our network.  This is like going to the computer store and buying two computers, one switch and some cables.  It also represents installing an operating system and assigning IPs.  Notice that our IP and MAC addresses are still the same from Part 2:</p>

<blockquote>
  <p>box1 will have the IP address of 192.168.0.1 and the MAC of 11:12:13:14:15:16 <br />
box2 will have the IP address of 192.168.0.2 and the MAC of 21:22:23:24:25:26</p>
</blockquote>

<p>Notice <code class="language-plaintext highlighter-rouge">switch.plug_in_interface(2, &amp;box2.interface);</code> takes the port number as the first argument.  This is more readable in code with type hints on because the function hint would look like this:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nf">plug_in_interface</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span> <span class="k">self</span><span class="p">,</span> <span class="n">port_number</span><span class="p">:</span> <span class="nb">u8</span><span class="p">,</span> <span class="n">interface</span><span class="p">:</span> <span class="o">&amp;</span><span class="n">Interface</span><span class="p">)</span>
</code></pre></div></div>

<p>And then later on in <code class="language-plaintext highlighter-rouge">plug_in_interface</code>, there is a <code class="language-plaintext highlighter-rouge">self.cam_table.insert(interface.mac, port_number);</code> call.  So, plugging in an interface cheats by recording its MAC and port number in the <code class="language-plaintext highlighter-rouge">cam_table</code>.</p>

<p>Then we have more <em>“cheating”</em> because we need to model a hosts file or DNS.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="c1">// box1 calls gethostbyname(box2) which is simulated here</span>
    <span class="k">let</span> <span class="n">hosts_file</span> <span class="o">=</span> <span class="nf">make_hosts_file</span><span class="p">();</span>
</code></pre></div></div>
<p>And each box’s IP is saved as variables <code class="language-plaintext highlighter-rouge">box1_host</code> and <code class="language-plaintext highlighter-rouge">box2_host</code> (not shown for brevity).  The function <code class="language-plaintext highlighter-rouge">make_hosts_file</code> is hardcoded as a helper function in main.</p>
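
<p>As a rough idea of what such a helper could look like, here is a sketch that assumes the hosts file is just a map from hostname to a small struct with an <code class="language-plaintext highlighter-rouge">ip</code> field.  The <code class="language-plaintext highlighter-rouge">Host</code> struct and the <code class="language-plaintext highlighter-rouge">resolve</code> function are assumptions for illustration; the real helper in main may be shaped differently.</p>

```rust
use std::collections::HashMap;
use std::net::Ipv4Addr;

// Hypothetical shape of one hosts file entry.
#[derive(Debug, Clone, PartialEq)]
pub struct Host {
    pub ip: Ipv4Addr,
}

/// Hardcoded name-to-address table standing in for /etc/hosts or DNS.
pub fn make_hosts_file() -> HashMap<String, Host> {
    let mut hosts = HashMap::new();
    hosts.insert("box1".to_owned(), Host { ip: Ipv4Addr::new(192, 168, 0, 1) });
    hosts.insert("box2".to_owned(), Host { ip: Ipv4Addr::new(192, 168, 0, 2) });
    hosts
}

/// Look up a hostname; None means the name is unknown.
pub fn resolve<'a>(hosts: &'a HashMap<String, Host>, name: &str) -> Option<&'a Host> {
    hosts.get(name)
}
```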

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="c1">// box1's IP stack figures out that the request is local, not needed to forward to the gateway</span>
    <span class="k">let</span> <span class="n">local_lan</span> <span class="o">=</span> <span class="k">crate</span><span class="p">::</span><span class="nn">network</span><span class="p">::</span><span class="nn">ip</span><span class="p">::</span><span class="nf">same_subnet</span><span class="p">(</span>
        <span class="n">box1_interface</span><span class="py">.ip</span><span class="nf">.parse</span><span class="p">()</span><span class="nf">.expect</span><span class="p">(</span><span class="s">"box1 ip is not an IP"</span><span class="p">),</span>
        <span class="n">box2_interface</span><span class="py">.ip</span><span class="nf">.parse</span><span class="p">()</span><span class="nf">.expect</span><span class="p">(</span><span class="s">"box2 ip is not an IP"</span><span class="p">),</span>
        <span class="n">box1_interface</span><span class="py">.subnet_mask</span><span class="nf">.clone</span><span class="p">(),</span>
    <span class="p">);</span>

    <span class="k">if</span> <span class="o">!</span><span class="n">local_lan</span> <span class="p">{</span>
        <span class="c1">// TODO: routing and routers</span>
        <span class="nd">panic!</span><span class="p">(</span><span class="s">"Routing not implemented."</span><span class="p">);</span>
    <span class="p">}</span>
</code></pre></div></div>

<p>This part is basically IP routing as discussed in <a href="/2024/05/05/a_network_in_rust_part_1/">Part 1</a>.  Box 1 and 2 are on the same subnet so <code class="language-plaintext highlighter-rouge">local_lan</code> is always true in this demo.  This is here to exercise the IP routing feature but there is no other path at the moment.  What would happen next is to implement a router and send the IP packet to the router instead of going to layer 2.</p>
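
<p>A minimal sketch of what that other branch might decide once a router exists: if the destination is local, ARP for the destination itself, otherwise ARP for the gateway.  The <code class="language-plaintext highlighter-rouge">next_hop</code> function is hypothetical and not in the repo, and it assumes the interface carries an optional gateway address.</p>

```rust
use std::net::Ipv4Addr;

/// Pick the IP whose MAC the sender should resolve with ARP.
/// Returns None when the destination is remote and no gateway is configured.
pub fn next_hop(
    src: Ipv4Addr,
    dest: Ipv4Addr,
    mask: Ipv4Addr,
    gateway: Option<Ipv4Addr>,
) -> Option<Ipv4Addr> {
    let mask_bits = u32::from(mask);
    let same_network = (u32::from(src) & mask_bits) == (u32::from(dest) & mask_bits);
    if same_network {
        Some(dest) // local: the frame is addressed to the destination's MAC
    } else {
        gateway // remote: the frame is addressed to the router's MAC
    }
}
```

<p>Either way the IP packet keeps the original destination IP.  Only the Ethernet destination changes.</p>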

<p>We continue instead and make an ICMP packet from the hosts file resolved IPs.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="c1">// box1 crafts an ICMP echo request and IP packet</span>
    <span class="k">let</span> <span class="n">payload</span> <span class="o">=</span> <span class="s">"This is a ping, weee"</span><span class="p">;</span>
    <span class="k">let</span> <span class="n">icmp_packet</span> <span class="o">=</span> <span class="nn">icmp</span><span class="p">::</span><span class="nf">packet</span><span class="p">(</span><span class="n">box1_host</span><span class="py">.ip</span><span class="nf">.to_string</span><span class="p">(),</span> <span class="n">box2_host</span><span class="py">.ip</span><span class="nf">.to_string</span><span class="p">(),</span> <span class="n">payload</span><span class="p">);</span>
</code></pre></div></div>

<p>Then, we go to layer 2 and make an ethernet frame.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="c1">// Once the IP address of box2 is known, box1 checks its ARP cache.</span>
    <span class="c1">// Now, this is a one-shot simulation program so we will set up this scenario, but later we</span>
    <span class="c1">// might turn this into a long running or concurrent or GUI program where this situation</span>
    <span class="c1">// is not pre-determined.  In the meantime: box1 looks up the MAC address for box2</span>
    <span class="c1">// using ARP.  The entry is not found, so it broadcasts an ARP who-has and gets a</span>
    <span class="c1">// response.  box1 adds the response to its ARP cache.</span>
    <span class="k">let</span> <span class="n">dest_mac</span> <span class="o">=</span> <span class="n">box1</span>
        <span class="py">.arp_table</span>
        <span class="nf">.lookup</span><span class="p">(</span><span class="o">&amp;</span><span class="n">box2_host</span><span class="py">.ip</span><span class="p">)</span>
        <span class="nf">.expect</span><span class="p">(</span><span class="s">"The demo has gone south because box2 ARP resolution failed"</span><span class="p">);</span>

    <span class="k">let</span> <span class="n">ethernet_frame</span> <span class="o">=</span>
        <span class="nn">network</span><span class="p">::</span><span class="nn">ethernet</span><span class="p">::</span><span class="nf">build_ethernet</span><span class="p">(</span><span class="n">box1_interface</span><span class="py">.mac</span><span class="p">,</span> <span class="o">*</span><span class="n">dest_mac</span><span class="p">,</span> <span class="n">icmp_packet</span><span class="p">);</span>

    <span class="c1">// uncomment to see the packet and open in wireshark</span>
    <span class="c1">// write_pcap("ping.pcap", &amp;ethernet_frame);</span>

    <span class="c1">// the packet is sent over ethernet to the switch which has its own MAC table etc</span>
    <span class="n">switch</span><span class="nf">.forward_frame</span><span class="p">(</span><span class="n">ethernet_frame</span><span class="p">);</span>

    <span class="c1">// the entire process is unwound on box2 which will not be covered here for now</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Finally, the switch would handle layer 1 and the bits would arrive at box 2.</p>

<p>So, when we run this we get output like this.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Interface with MAC: "11:12:13:14:15:16" plugged into port: 1
Interface with MAC: "21:22:23:24:25:26" plugged into port: 2
Sending frame out on port with MAC: [21, 22, 23, 24, 25, 26]
Frame forwarded to MAC: "21:22:23:24:25:26"
</code></pre></div></div>

<p>But, there is something more interesting we can do.  We can dump the Ethernet frame and see if we really are doing <em>“networking”</em>.  In the main, there is a <code class="language-plaintext highlighter-rouge">write_pcap()</code> function that we can uncomment and it will dump a pcap file at the project root.</p>

<p>If you open this pcap file in Wireshark, it will parse correctly.  These are the bits that would travel over the cable to the modeled Ethernet switch, and it is the same frame that the “Sending frame out …” message above refers to.</p>
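
<p>If you are curious what is inside that file, the classic pcap format is small enough to write by hand.  The sketch below is not the repo’s <code class="language-plaintext highlighter-rouge">write_pcap</code>; <code class="language-plaintext highlighter-rouge">pcap_bytes</code> and <code class="language-plaintext highlighter-rouge">write_pcap_sketch</code> are hypothetical names.  It assumes the classic (not pcapng) little-endian layout, a 24 byte global header followed by a 16 byte record header per packet, and it zeroes the timestamps.</p>

```rust
/// Wrap one Ethernet frame in a classic pcap container.
pub fn pcap_bytes(frame: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(24 + 16 + frame.len());

    // Global header, 24 bytes, little-endian.
    out.extend_from_slice(&0xa1b2_c3d4_u32.to_le_bytes()); // magic number
    out.extend_from_slice(&2u16.to_le_bytes()); // version major
    out.extend_from_slice(&4u16.to_le_bytes()); // version minor
    out.extend_from_slice(&0i32.to_le_bytes()); // timezone offset, unused
    out.extend_from_slice(&0u32.to_le_bytes()); // timestamp accuracy, unused
    out.extend_from_slice(&65_535u32.to_le_bytes()); // snapshot length
    out.extend_from_slice(&1u32.to_le_bytes()); // link type 1 = Ethernet

    // Per-packet record header, 16 bytes.
    let len = frame.len() as u32;
    out.extend_from_slice(&0u32.to_le_bytes()); // timestamp seconds (zero for the demo)
    out.extend_from_slice(&0u32.to_le_bytes()); // timestamp microseconds
    out.extend_from_slice(&len.to_le_bytes()); // bytes captured
    out.extend_from_slice(&len.to_le_bytes()); // bytes on the wire

    // The frame itself, starting at the destination MAC.
    out.extend_from_slice(frame);
    out
}

/// Write the single-packet capture to disk.
pub fn write_pcap_sketch(path: &str, frame: &[u8]) -> std::io::Result<()> {
    std::fs::write(path, pcap_bytes(frame))
}
```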

<p><img alt="The ICMP and IP packets in Wireshark" style="width: 100%; margin: auto;" src="/uploads/2024/wireshark_ping.png" /></p>

<p>You’ll notice there are two warnings in yellow:</p>

<ol>
  <li>It doesn’t like the MAC address we made up because it probably breaks spec.</li>
  <li>There isn’t an ICMP reply message seen.</li>
</ol>

<p>Pretty funny.  But otherwise, we see the payload <em>“This is a ping, weee”</em> in the right pane of Wireshark and our source and destination are all there and correct.  You can even see the different layers.  Ethernet is on top in the left pane, then Internet Protocol and finally ICMP.  These are the networking layers stacked, like envelopes containing envelopes.</p>
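
<p>Going back to the first warning, a plausible explanation (an assumption, not something verified against Wireshark’s exact check) is in the first octet of the MAC.  In an Ethernet MAC address, the least significant bit of the first octet marks a group (multicast) address and the next bit marks a locally administered one.  Both made-up MACs start with an odd octet, <code class="language-plaintext highlighter-rouge">0x11</code> and <code class="language-plaintext highlighter-rouge">0x21</code>, so they look like multicast addresses, which is unusual for a source.  The helper names below are hypothetical.</p>

```rust
/// True when the least significant bit of the first octet is set,
/// which marks a group (multicast) address.
pub fn is_multicast(mac: [u8; 6]) -> bool {
    (mac[0] & 0b0000_0001) != 0
}

/// True when the second least significant bit of the first octet is set,
/// which marks a locally administered address.
pub fn is_locally_administered(mac: [u8; 6]) -> bool {
    (mac[0] & 0b0000_0010) != 0
}
```

<p>A made-up address starting with <code class="language-plaintext highlighter-rouge">02</code>, like <code class="language-plaintext highlighter-rouge">02:12:13:14:15:16</code>, would be unicast and locally administered, which is the conventional choice for invented MACs.</p>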

<h2 id="the-network-is-real">The Network is Real</h2>

<p>So, that’s a network simulation in Rust.  What is fun to think about is that captured packets like this one can be replayed on a real network.  You could create a <code class="language-plaintext highlighter-rouge">tun0</code> interface and then send this packet out on a real network.  If you had a host with that MAC address and IP then it would be delivered.  But even if it wasn’t delivered, something else would happen like the switch would ignore it or try to ARP broadcast.  Or you could change the simulation to use a MAC and an IP that you really do have.</p>

<p>Seeing our simulated ping in Wireshark is a nice visual validation of the simulation we made.  We’ve gone from theoretical concepts to tangible output, bridging the gap between code and real-world networking.  This experiment has hopefully taken a bit of the mystery out of networking.  It did for me.  I learned many things from doing all this.</p>

<ol>
  <li>I did not know how the CAM table worked.</li>
  <li>I had not thought about which pieces went where on a server.</li>
  <li>I had never used the Etherparse library before or generated a packet programmatically.</li>
</ol>

<p>So, that was fun and I’m glad I got around to making three posts about it.  Happy exploring and <a href="https://github.com/squarism/layer_three/">check out the repo</a> and make it your own or implement something else to learn it.  <a href="https://jvns.ca/blog/2023/05/12/introducing-implement-dns-in-a-weekend/">Learning by implementing is not my idea</a>.</p>]]></content><author><name></name></author><summary type="html"><![CDATA[We are in the home-stretch now. Let’s show a few more details and then we’ll run the whole thing and show that it works. This post is part 3 of a series. We are learning networking by building a network. Part 1 - covered networking basics and implemented MAC addressing Part 2 - implemented primitives like IP, ARP and an Interface More Abstractions In part 2, we created an interface and previewed how we can make “a box” at the end of the post. A box is just slang for a server, a node or a computer. There are many ways we could represent a box, but one way would be for it to own the things that it owns in the real world. Since we will not model an entire operating system here, this is just an approximation. The Server A box (or server) owns its hostname, has an interface and an ARP table. pub struct Server { pub hostname: String, pub interface: Interface, // for now, one interface routes: Vec&lt;Route&gt;, // routing table, not implemented pub arp_table: arp_cache::ArpCache, } It also owns its routing table as routes. But the routes part of a server is not implemented here because we never made a router. There is also a major gap in how an ICMP reply would be handled. In the real world, there would be a network stack listening that would generate a response. This is not implemented. 
The Switch pub struct Switch { ports: HashMap&lt;u8, port::Port&gt;, link_lights: HashMap&lt;u8, bool&gt;, cam_table: HashMap&lt;MacAddress, u8&gt;, // MAC to port number }]]></summary></entry><entry><title type="html">A Network in Rust, Part 2</title><link href="https://squarism.com/2024/05/17/a_network_in_rust_part_2/" rel="alternate" type="text/html" title="A Network in Rust, Part 2" /><published>2024-05-17T00:00:00-07:00</published><updated>2024-05-17T00:00:00-07:00</updated><id>https://squarism.com/2024/05/17/a_network_in_rust_part_2</id><content type="html" xml:base="https://squarism.com/2024/05/17/a_network_in_rust_part_2/"><![CDATA[<p><img alt="Abstract Networking" style="width: 100%; margin: auto;" src="/uploads/2024/networking_abstract.jpg" /></p>

<p>This post is part 2 of a series.  We are learning networking by building it.</p>

<ul>
  <li><a href="/2024/05/05/a_network_in_rust_part_1/">Part 1</a> - covered networking basics and implemented MAC addressing</li>
  <li><a href="/2024/06/01/a_network_in_rust_part_3/">Part 3</a> - finishes the abstractions and shows the whole thing working</li>
</ul>

<h2 id="more-primitives">More Primitives</h2>

<p>Let’s model an IP address next.  Luckily, Rust has one already as <a href="https://doc.rust-lang.org/std/net/struct.Ipv4Addr.html">std::net::Ipv4Addr</a> but it does not include all the functionality we need.  It is just the data representation of an IP address.  What we still need is the routing logic.</p>

<p>I’m certain you’ve seen an IP address before.  The interesting thing about an IP address is that it usually goes together with a <a href="https://en.wikipedia.org/wiki/Subnet#Subnetting">subnet mask</a> and a gateway address.  These three things together tell the IP stack whether an address is local or not.  If it’s not local, it sends it to a router.  If it is local, it continues to send the message, usually to the local network.</p>

<p>In our simple example, we won’t implement a router but we still want to explore this network concept so we’ll implement it as a function even if it will inevitably return <code class="language-plaintext highlighter-rouge">true</code>.  Figuring out if a network address is local or not is not something Rust can do with stdlib.  But the logic is pretty simple and can be expressed in 4 lines of Rust.</p>

<p>What we need to do to figure out if our IP message is local is look at the source IP, destination IP and subnet mask:</p>

<ol>
  <li>Take the source IP address and figure out the network address.  The network address is the bitwise AND of the subnet mask and the IP address.  This is why it’s called a subnet mask.  You just AND the bits together.  I explain this later.</li>
  <li>Take the destination IP address and do the same thing.</li>
  <li>Now you have two network addresses.  Source network address and destination network address.</li>
  <li>Compare the two networks.  If they are the same, send it locally (ie: Ethernet).  If they aren’t, send the IP message to the router.</li>
</ol>

<h3 id="the-mask-in-subnet-mask">The Mask in Subnet Mask</h3>

<p>Let’s say we have this IP address and subnet mask.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>IP Address:  192.168.0.1
Subnet Mask: 255.255.255.0
</code></pre></div></div>

<p>We can explode this out into bits because each number is an 8-bit number in IP version 4.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>IP:   11000000.10101000.00000000.00000001
Mask: 11111111.11111111.11111111.00000000
</code></pre></div></div>

<p>If we logically AND these together (only print a 1 if both are 1) then we get this:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>11000000.10101000.00000000.00000000
</code></pre></div></div>

<p>Which if you turn into decimal again is <code class="language-plaintext highlighter-rouge">192.168.0.0</code>.  This tells us what the network address is.  In other words, if we do the same for an IP address which is “next to” <code class="language-plaintext highlighter-rouge">192.168.0.1</code> like <code class="language-plaintext highlighter-rouge">192.168.0.42</code> then the network address is the same.  This is what the mask does.  It’s a bit mask applied with a bitwise AND operation.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>IP Address:  192.168.0.1
Subnet Mask: 255.255.255.0
Network:     192.168.0.0

IP Address:  192.168.0.42
Subnet Mask: 255.255.255.0
Network:     192.168.0.0
</code></pre></div></div>

<p>So, you can see that the network address for <code class="language-plaintext highlighter-rouge">192.168.0.1</code> and <code class="language-plaintext highlighter-rouge">192.168.0.42</code> above is the same: <code class="language-plaintext highlighter-rouge">192.168.0.0</code>.  That means, don’t route it.  It’s on the same network, just send it locally (make an Ethernet message).  So, let’s create a function to do all this.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">use</span> <span class="nn">std</span><span class="p">::</span><span class="nn">net</span><span class="p">::</span><span class="n">Ipv4Addr</span><span class="p">;</span>

<span class="k">pub</span> <span class="k">fn</span> <span class="nf">same_subnet</span><span class="p">(</span><span class="n">src</span><span class="p">:</span> <span class="n">Ipv4Addr</span><span class="p">,</span> <span class="n">dest</span><span class="p">:</span> <span class="n">Ipv4Addr</span><span class="p">,</span> <span class="n">subnet</span><span class="p">:</span> <span class="nb">String</span><span class="p">)</span> <span class="k">-&gt;</span> <span class="nb">bool</span> <span class="p">{</span>
    <span class="k">let</span> <span class="n">subnet_parsed</span><span class="p">:</span> <span class="n">Ipv4Addr</span> <span class="o">=</span> <span class="n">subnet</span><span class="nf">.parse</span><span class="p">()</span><span class="nf">.unwrap</span><span class="p">();</span>

    <span class="k">let</span> <span class="n">src_network</span> <span class="o">=</span> <span class="nf">ipv4_to_u32</span><span class="p">(</span><span class="n">src</span><span class="nf">.octets</span><span class="p">())</span> <span class="o">&amp;</span> <span class="nf">ipv4_to_u32</span><span class="p">(</span><span class="n">subnet_parsed</span><span class="nf">.octets</span><span class="p">());</span>
    <span class="k">let</span> <span class="n">dest_network</span> <span class="o">=</span> <span class="nf">ipv4_to_u32</span><span class="p">(</span><span class="n">dest</span><span class="nf">.octets</span><span class="p">())</span> <span class="o">&amp;</span> <span class="nf">ipv4_to_u32</span><span class="p">(</span><span class="n">subnet_parsed</span><span class="nf">.octets</span><span class="p">());</span>

    <span class="n">src_network</span> <span class="o">==</span> <span class="n">dest_network</span>
<span class="p">}</span>

<span class="c1">// to_bits is nightly experimental on Ipv4Addr so we have to do it ourselves</span>
<span class="k">fn</span> <span class="nf">ipv4_to_u32</span><span class="p">(</span><span class="n">octets</span><span class="p">:</span> <span class="p">[</span><span class="nb">u8</span><span class="p">;</span> <span class="mi">4</span><span class="p">])</span> <span class="k">-&gt;</span> <span class="nb">u32</span> <span class="p">{</span>
    <span class="p">((</span><span class="n">octets</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="k">as</span> <span class="nb">u32</span><span class="p">)</span> <span class="o">&lt;&lt;</span> <span class="mi">24</span><span class="p">)</span>
        <span class="p">|</span> <span class="p">((</span><span class="n">octets</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="k">as</span> <span class="nb">u32</span><span class="p">)</span> <span class="o">&lt;&lt;</span> <span class="mi">16</span><span class="p">)</span>
        <span class="p">|</span> <span class="p">((</span><span class="n">octets</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span> <span class="k">as</span> <span class="nb">u32</span><span class="p">)</span> <span class="o">&lt;&lt;</span> <span class="mi">8</span><span class="p">)</span>
        <span class="p">|</span> <span class="p">(</span><span class="n">octets</span><span class="p">[</span><span class="mi">3</span><span class="p">]</span> <span class="k">as</span> <span class="nb">u32</span><span class="p">)</span>
<span class="p">}</span>
</code></pre></div></div>

<!-- more -->

<p>So, part of IP routing is this network matching, which the entire internet uses, although there are many other details and specifics.  For now, consider this question:</p>

<blockquote>
  <p>Are the IPs 192.168.0.1 and 192.168.0.42 on the same network?</p>
</blockquote>

<p>Yes, they are both on the same network.  And we can see this behavior in a test.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nd">#[test]</span>
<span class="k">fn</span> <span class="nf">test_subnet_routing</span><span class="p">()</span> <span class="p">{</span>
    <span class="k">let</span> <span class="n">src</span> <span class="o">=</span> <span class="nn">Ipv4Addr</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span><span class="mi">192</span><span class="p">,</span> <span class="mi">168</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">);</span>
    <span class="k">let</span> <span class="n">dest</span> <span class="o">=</span> <span class="nn">Ipv4Addr</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span><span class="mi">192</span><span class="p">,</span> <span class="mi">168</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">2</span><span class="p">);</span>
    <span class="k">let</span> <span class="n">subnet</span> <span class="o">=</span> <span class="s">"255.255.0.0"</span><span class="nf">.to_owned</span><span class="p">();</span>

    <span class="k">let</span> <span class="n">result</span> <span class="o">=</span> <span class="nf">same_subnet</span><span class="p">(</span><span class="n">src</span><span class="p">,</span> <span class="n">dest</span><span class="p">,</span> <span class="n">subnet</span><span class="p">);</span>
    <span class="nd">assert!</span><span class="p">(</span><span class="n">result</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Because they are both on the same network, we don’t have to forward the message to another IP device like a router.  We can just send it locally, which brings us to the next part.</p>
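<p>As a rough sketch of the non-local case, here is a variant that leans on the standard library’s stable <code class="language-plaintext highlighter-rouge">From&lt;Ipv4Addr&gt; for u32</code> conversion instead of shifting octets by hand.  The names <code class="language-plaintext highlighter-rouge">network_address</code> and <code class="language-plaintext highlighter-rouge">same_network</code> are made up for this illustration and are not part of the series’ code, and the mask is assumed to arrive as an <code class="language-plaintext highlighter-rouge">Ipv4Addr</code> rather than a <code class="language-plaintext highlighter-rouge">String</code>.</p>

```rust
use std::net::Ipv4Addr;

/// Network address: the IP with the host bits masked off.
/// (Hypothetical helper, not part of the series' code.)
pub fn network_address(ip: Ipv4Addr, mask: Ipv4Addr) -> Ipv4Addr {
    // u32::from(Ipv4Addr) keeps the octets in network (big-endian) order
    Ipv4Addr::from(u32::from(ip) & u32::from(mask))
}

/// Hypothetical variant of `same_subnet` that takes the mask as an `Ipv4Addr`.
pub fn same_network(src: Ipv4Addr, dest: Ipv4Addr, mask: Ipv4Addr) -> bool {
    network_address(src, mask) == network_address(dest, mask)
}
```

<p>Under these assumptions, 192.168.0.1 and 192.168.1.1 land on different networks with a 255.255.255.0 mask, but on the same network with the 255.255.0.0 mask used in the test above.</p>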

<h2 id="arp">ARP</h2>

<p>IP is a protocol used for routing traffic over the internet and on local area networks.  Similarly, there is another protocol called <a href="https://en.wikipedia.org/wiki/Address_Resolution_Protocol">ARP</a> which discovers hosts on a local network.  Its job is to discover what MAC addresses go with what IPs.  When we send an IP message, an IP address alone is not enough.  Remember that in <a href="/2024/05/05/a_network_in_rust_part_1/">Part 1</a>, we talked about how messages go up and down abstraction layers.  So ARP is sort of connecting Layer 3 and Layer 2 because it connects IP addresses and hardware addresses which (in wired networks) lead us to ports on a switch and eventually electrons on a wire.</p>

<p>Simulating a real ARP request would be complicated because it’s usually built into the operating system or network stack.  For this simulation, we’re going to hardcode responses like this.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// src/network/arp.rs</span>
<span class="k">impl</span> <span class="n">Arp</span> <span class="p">{</span>
    <span class="c1">// Simulating a broadcast, not accurate or realistic</span>
    <span class="c1">// In a real scenario, this would involve network communication</span>
    <span class="k">pub</span> <span class="k">fn</span> <span class="nf">broadcast_arp_request</span><span class="p">(</span><span class="n">ip_address</span><span class="p">:</span> <span class="n">IpAddr</span><span class="p">)</span> <span class="k">-&gt;</span> <span class="nb">Option</span><span class="o">&lt;</span><span class="n">MacAddress</span><span class="o">&gt;</span> <span class="p">{</span>
        <span class="k">match</span> <span class="n">ip_address</span><span class="nf">.to_string</span><span class="p">()</span><span class="nf">.as_str</span><span class="p">()</span> <span class="p">{</span>
            <span class="s">"192.168.0.1"</span> <span class="k">=&gt;</span> <span class="nf">Some</span><span class="p">([</span><span class="mi">0x11</span><span class="p">,</span> <span class="mi">0x12</span><span class="p">,</span> <span class="mi">0x13</span><span class="p">,</span> <span class="mi">0x14</span><span class="p">,</span> <span class="mi">0x15</span><span class="p">,</span> <span class="mi">0x16</span><span class="p">]),</span>
            <span class="s">"192.168.0.2"</span> <span class="k">=&gt;</span> <span class="nf">Some</span><span class="p">([</span><span class="mi">0x21</span><span class="p">,</span> <span class="mi">0x22</span><span class="p">,</span> <span class="mi">0x23</span><span class="p">,</span> <span class="mi">0x24</span><span class="p">,</span> <span class="mi">0x25</span><span class="p">,</span> <span class="mi">0x26</span><span class="p">]),</span>
            <span class="n">_</span> <span class="k">=&gt;</span> <span class="nb">None</span><span class="p">,</span> <span class="c1">// do nothing and like it</span>
        <span class="p">}</span>
    <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>

<p>For the remainder of this simulation, we’ll use a convention where box1 is all about the number 1.  Its IP ends in 1, each byte of its MAC starts with 1, and it will eventually be plugged into port 1 on a network switch we will build.  In Part 3, we will observe success by seeing a real network message get passed, and this convention will make that output easier to read.</p>

<blockquote>
  <p>box1 will have the IP address of 192.168.0.1 and the MAC of 11:12:13:14:15:16 <br />
box2 will have the IP address of 192.168.0.2 and the MAC of 21:22:23:24:25:26</p>
</blockquote>

<p>In a real network, the ARP request would be a real-time event, and simulating this would involve threads and concurrency, which is not the point of this exercise.  So, ARP gives us a MAC address.  What we do next is make an Ethernet message that uses this MAC address.</p>
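<p>Part 1 mentioned that a host remembers ARP answers in its ARP cache.  The series’ code does not model that cache, but a minimal sketch of the idea could look like the following.  <code class="language-plaintext highlighter-rouge">ArpCache</code>, <code class="language-plaintext highlighter-rouge">lookup</code> and <code class="language-plaintext highlighter-rouge">fake_broadcast</code> are hypothetical names, and <code class="language-plaintext highlighter-rouge">MacAddress</code> is assumed to be a plain six-byte array.</p>

```rust
use std::collections::HashMap;
use std::net::Ipv4Addr;

// Assumption: a MAC address is a plain six-byte array.
type MacAddress = [u8; 6];

/// Hypothetical ARP cache: remembers which MAC answered for which IP.
#[derive(Default)]
pub struct ArpCache {
    entries: HashMap<Ipv4Addr, MacAddress>,
}

impl ArpCache {
    /// Return the cached MAC if there is one. Otherwise call `resolve`
    /// (standing in for a broadcast) and remember a successful answer.
    pub fn lookup<F>(&mut self, ip: Ipv4Addr, resolve: F) -> Option<MacAddress>
    where
        F: FnOnce(Ipv4Addr) -> Option<MacAddress>,
    {
        if let Some(&mac) = self.entries.get(&ip) {
            return Some(mac);
        }
        let mac = resolve(ip)?;
        self.entries.insert(ip, mac);
        Some(mac)
    }
}

/// Stand-in for the hardcoded broadcast, matching on octets instead of strings.
pub fn fake_broadcast(ip: Ipv4Addr) -> Option<MacAddress> {
    match ip.octets() {
        [192, 168, 0, 1] => Some([0x11, 0x12, 0x13, 0x14, 0x15, 0x16]),
        [192, 168, 0, 2] => Some([0x21, 0x22, 0x23, 0x24, 0x25, 0x26]),
        _ => None,
    }
}
```

<p>In this sketch, only the first lookup for an IP reaches the resolver, and unknown IPs are never cached.</p>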

<h2 id="ethernet">Ethernet</h2>

<p>The <a href="https://docs.rs/etherparse/latest/etherparse/">etherparse</a> library has a packet builder which is very convenient but for this demonstration is unfortunately reversed in a particular way.  What we are trying to demonstrate is going through the layers in order of the OSI model.  But etherparse’s library does not allow us to do this.</p>

<p>Just as an example, this is how etherparse lets you create an Ethernet packet (frame) using their <code class="language-plaintext highlighter-rouge">PacketBuilder</code>.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">pub</span> <span class="k">fn</span> <span class="nf">packet</span><span class="p">(</span><span class="n">src_mac</span><span class="p">:</span> <span class="n">MacAddress</span><span class="p">,</span> <span class="n">dest_mac</span><span class="p">:</span> <span class="n">MacAddress</span><span class="p">,</span> <span class="n">payload</span><span class="p">:</span> <span class="o">&amp;</span><span class="nb">str</span><span class="p">)</span> <span class="k">-&gt;</span> <span class="nb">Vec</span><span class="o">&lt;</span><span class="nb">u8</span><span class="o">&gt;</span> <span class="p">{</span>
    <span class="k">let</span> <span class="n">builder</span> <span class="o">=</span> <span class="nn">PacketBuilder</span><span class="p">::</span><span class="nf">ethernet2</span><span class="p">(</span><span class="n">src_mac</span><span class="p">,</span> <span class="n">dest_mac</span><span class="p">)</span>
        <span class="nf">.ipv4</span><span class="p">([</span><span class="mi">192</span><span class="p">,</span> <span class="mi">168</span><span class="p">,</span> <span class="mi">6</span><span class="p">,</span> <span class="mi">10</span><span class="p">],</span> <span class="p">[</span><span class="mi">192</span><span class="p">,</span> <span class="mi">168</span><span class="p">,</span> <span class="mi">6</span><span class="p">,</span> <span class="mi">20</span><span class="p">],</span> <span class="mi">42</span><span class="p">)</span>
        <span class="nf">.icmpv4_echo_request</span><span class="p">(</span><span class="mi">123</span><span class="p">,</span> <span class="mi">456</span><span class="p">);</span>
    <span class="k">let</span> <span class="k">mut</span> <span class="n">buffer</span> <span class="o">=</span> <span class="nn">Vec</span><span class="p">::</span><span class="nf">new</span><span class="p">();</span>
    <span class="n">builder</span><span class="nf">.write</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span> <span class="n">buffer</span><span class="p">,</span> <span class="n">payload</span><span class="nf">.as_bytes</span><span class="p">())</span><span class="nf">.unwrap</span><span class="p">();</span> <span class="c1">// all headers and the payload are written in one coupled step</span>
    <span class="n">buffer</span>
<span class="p">}</span>
</code></pre></div></div>

<p>We are starting with an IP packet and trying to make an ethernet packet.  But etherparse doesn’t let us split up these method chains.  So, unfortunately, we have to pass Vectors of bytes around.  Fortunately, this is closer to the real world, where data flows between the networking layers as plain bytes.</p>

<p>So, etherparse is reversed from what we want and coupled.  Etherparse does <code class="language-plaintext highlighter-rouge">::ethernet2().ipv4()</code> and we want <code class="language-plaintext highlighter-rouge">::ipv4().ethernet2()</code> basically.  And it’s coupled because you can’t break these apart.  It’s just how Etherparse’s library works.  In a real project, it wouldn’t matter.  But for explaining the layers and learning, it’s just unfortunate.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// src/network/ethernet.rs</span>

<span class="k">use</span> <span class="nn">etherparse</span><span class="p">::</span><span class="n">Ethernet2Header</span><span class="p">;</span>

<span class="k">pub</span> <span class="k">fn</span> <span class="nf">build_ethernet</span><span class="p">(</span><span class="n">source</span><span class="p">:</span> <span class="p">[</span><span class="nb">u8</span><span class="p">;</span> <span class="mi">6</span><span class="p">],</span> <span class="n">destination</span><span class="p">:</span> <span class="p">[</span><span class="nb">u8</span><span class="p">;</span> <span class="mi">6</span><span class="p">],</span> <span class="n">payload</span><span class="p">:</span> <span class="nb">Vec</span><span class="o">&lt;</span><span class="nb">u8</span><span class="o">&gt;</span><span class="p">)</span> <span class="k">-&gt;</span> <span class="nb">Vec</span><span class="o">&lt;</span><span class="nb">u8</span><span class="o">&gt;</span> <span class="p">{</span>
    <span class="k">let</span> <span class="k">mut</span> <span class="n">buffer</span> <span class="o">=</span> <span class="nn">Vec</span><span class="p">::</span><span class="o">&lt;</span><span class="nb">u8</span><span class="o">&gt;</span><span class="p">::</span><span class="nf">new</span><span class="p">();</span>

    <span class="k">let</span> <span class="n">header</span> <span class="o">=</span> <span class="n">Ethernet2Header</span> <span class="p">{</span>
        <span class="n">source</span><span class="p">,</span>
        <span class="n">destination</span><span class="p">,</span>
        <span class="n">ether_type</span><span class="p">:</span> <span class="nn">etherparse</span><span class="p">::</span><span class="nn">EtherType</span><span class="p">::</span><span class="n">IPV4</span><span class="p">,</span>
    <span class="p">};</span>

    <span class="n">header</span>
        <span class="nf">.write</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span> <span class="n">buffer</span><span class="p">)</span>
        <span class="nf">.expect</span><span class="p">(</span><span class="s">"Failed to write Ethernet frame"</span><span class="p">);</span>
    <span class="n">buffer</span><span class="nf">.extend</span><span class="p">(</span><span class="n">payload</span><span class="p">);</span>

    <span class="n">buffer</span>
<span class="p">}</span>
</code></pre></div></div>

<p>There is our ethernet message.  It’s a Vector of bytes and etherparse handles the bit offsets for us.  Let’s finally create a network interface that can contain an IP address and a MAC address and send this Ethernet message.</p>
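<p>To show what those offsets amount to, here is a hand-rolled sketch of the same byte layout using only the standard library.  An Ethernet II header is 14 bytes: destination MAC, source MAC, then a two-byte EtherType (0x0800 for IPv4).  The name <code class="language-plaintext highlighter-rouge">ethernet_bytes</code> is made up for this illustration; the series itself keeps using etherparse.</p>

```rust
/// EtherType value for IPv4 in an Ethernet II header.
const ETHERTYPE_IPV4: u16 = 0x0800;

/// Hypothetical hand-rolled frame builder: header bytes followed by the payload.
/// No preamble or frame check sequence is modeled here.
pub fn ethernet_bytes(source: [u8; 6], destination: [u8; 6], payload: &[u8]) -> Vec<u8> {
    let mut buffer = Vec::with_capacity(14 + payload.len());
    buffer.extend_from_slice(&destination); // bytes 0..6
    buffer.extend_from_slice(&source); // bytes 6..12
    buffer.extend_from_slice(&ETHERTYPE_IPV4.to_be_bytes()); // bytes 12..14
    buffer.extend_from_slice(payload); // the IP packet follows the header
    buffer
}
```

<p>Note that the destination comes first on the wire even though we tend to say “source and destination” when talking about a message.</p>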

<h2 id="an-interface">An Interface</h2>

<p>This is modeling a conceptual network interface.  As we saw in Part 1, commands like <code class="language-plaintext highlighter-rouge">ip addr</code> and <code class="language-plaintext highlighter-rouge">ifconfig</code> on Linux can show us network interface information.  Everything those tools report is too much to implement, so our interface only holds a MAC address, an IP address, a subnet mask and an optional gateway.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// src/server/interface.rs</span>
<span class="k">use</span> <span class="k">crate</span><span class="p">::</span><span class="nn">mac</span><span class="p">::{</span><span class="k">self</span><span class="p">,</span> <span class="n">MacAddress</span><span class="p">};</span>

<span class="nd">#[derive(Clone)]</span>
<span class="k">pub</span> <span class="k">struct</span> <span class="n">Interface</span> <span class="p">{</span>
    <span class="k">pub</span> <span class="n">mac</span><span class="p">:</span> <span class="n">MacAddress</span><span class="p">,</span>
    <span class="k">pub</span> <span class="n">ip</span><span class="p">:</span> <span class="nb">String</span><span class="p">,</span>
    <span class="k">pub</span> <span class="n">subnet_mask</span><span class="p">:</span> <span class="nb">String</span><span class="p">,</span>
    <span class="k">pub</span> <span class="n">gateway</span><span class="p">:</span> <span class="nb">Option</span><span class="o">&lt;</span><span class="nb">String</span><span class="o">&gt;</span><span class="p">,</span>
<span class="p">}</span>

<span class="k">impl</span> <span class="n">Interface</span> <span class="p">{</span>
    <span class="k">pub</span> <span class="k">fn</span> <span class="nf">new</span><span class="p">(</span><span class="n">mac</span><span class="p">:</span> <span class="o">&amp;</span><span class="nb">str</span><span class="p">,</span> <span class="n">ip</span><span class="p">:</span> <span class="nb">String</span><span class="p">,</span> <span class="n">subnet_mask</span><span class="p">:</span> <span class="nb">String</span><span class="p">,</span> <span class="n">gateway</span><span class="p">:</span> <span class="nb">Option</span><span class="o">&lt;</span><span class="nb">String</span><span class="o">&gt;</span><span class="p">)</span> <span class="k">-&gt;</span> <span class="n">Interface</span> <span class="p">{</span>
        <span class="n">Interface</span> <span class="p">{</span>
            <span class="n">mac</span><span class="p">:</span> <span class="nn">mac</span><span class="p">::</span><span class="nf">parse_mac_address</span><span class="p">(</span><span class="n">mac</span><span class="p">),</span>
            <span class="n">ip</span><span class="p">,</span>
            <span class="n">subnet_mask</span><span class="p">,</span>
            <span class="n">gateway</span><span class="p">,</span>
        <span class="p">}</span>
    <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
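<p>The constructor relies on <code class="language-plaintext highlighter-rouge">mac::parse_mac_address</code> from Part 1, which is not repeated here.  As a rough idea of what such a parser has to do, here is a standalone sketch.  <code class="language-plaintext highlighter-rouge">parse_mac_sketch</code> is a hypothetical name, and returning an <code class="language-plaintext highlighter-rouge">Option</code> is an assumption of this example rather than the behavior of the Part 1 function.</p>

```rust
/// Hypothetical stand-in for a MAC parser: turns "11:12:13:14:15:16"
/// into six bytes, or returns None if the text is malformed.
pub fn parse_mac_sketch(text: &str) -> Option<[u8; 6]> {
    let mut mac = [0u8; 6];
    let mut parts = text.split(':');
    for byte in mac.iter_mut() {
        // each part must be exactly two hex digits, e.g. "1a"
        let part = parts.next()?;
        if part.len() != 2 || !part.bytes().all(|b| b.is_ascii_hexdigit()) {
            return None;
        }
        *byte = u8::from_str_radix(part, 16).ok()?;
    }
    // reject trailing parts such as a seventh byte
    if parts.next().is_some() {
        return None;
    }
    Some(mac)
}
```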

<h2 id="a-preview-of-putting-this-together">A Preview of putting this together</h2>

<p>In the next part, we’ll finish up with a server, a switch and a <code class="language-plaintext highlighter-rouge">main.rs</code> that runs the entire thing.</p>

<p>So, for example, when we are setting up <code class="language-plaintext highlighter-rouge">box1</code> in our <code class="language-plaintext highlighter-rouge">main.rs</code>, we can do this.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">box1_interface</span> <span class="o">=</span> <span class="nn">server</span><span class="p">::</span><span class="nn">interface</span><span class="p">::</span><span class="nn">Interface</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span>
    <span class="s">"11:12:13:14:15:16"</span><span class="p">,</span>
    <span class="s">"192.168.0.1"</span><span class="nf">.to_owned</span><span class="p">(),</span>
    <span class="s">"255.255.0.0"</span><span class="nf">.to_owned</span><span class="p">(),</span>
    <span class="nb">None</span><span class="p">,</span>
<span class="p">);</span>
<span class="k">let</span> <span class="k">mut</span> <span class="n">box1</span> <span class="o">=</span> <span class="nn">Server</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span><span class="s">"box1"</span><span class="nf">.to_owned</span><span class="p">(),</span> <span class="n">box1_interface</span><span class="nf">.clone</span><span class="p">());</span>
</code></pre></div></div>

<p>And we will “plug it in” to a switch on Port 1 that we haven’t created yet.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">switch</span><span class="nf">.plug_in_interface</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">box1</span><span class="py">.interface</span><span class="p">);</span>
</code></pre></div></div>

<p>See you in the next post.</p>]]></content><author><name></name></author><summary type="html"><![CDATA[This post is part 2 of a series. We are learning networking by building it. Part 1 - covered networking basics and implemented MAC addressing Part 3 - finishes the abstractions and shows the whole thing working More Primitives Let’s model an IP address next. Luckily, Rust has one already as std::net::Ipv4Addr but it does not include all the functionality we need. It is just the data representation of an IP address. We need what is routing. I’m certain you’ve seen an IP address before. The interesting thing about an IP address is that it usually goes together with a subnet mask and a gateway address. These three things together tell the IP stack whether an address is local or not. If it’s not local, it sends it to a router. If it is local, it continues to send the message, usually to the local network. In our simple example, we won’t implement a router but we still want to explore this network concept so we’ll implement it as a function even if it will inevitably return true. Figuring out if a network address is local or not is not something Rust can do with stdlib. But the logic is pretty simple and can be expressed in 4 lines of Rust. What we need to do to figure out if our IP message is local is look at the source IP, destination IP and subnet mask: Take the source IP address and figure out the network address. The network address is a bitmask of the subnet mask and the IP address. This is why it’s called a subnet mask. You just AND the bits up. I explain this later. Take the destination IP address and do the same thing. Now you have two network addresses. Source network address and destination network address. Compare the two networks. If they are the same, send it locally (ie: Ethernet). If they aren’t, send the IP message to the router. The Mask in Subnet Mask Let’s say we have this IP address and subnet mask. 
IP Address: 192.168.0.1 Subnet Mask: 255.255.255.0 We can explode this out into bits because each number is an 8-bit number in IP version 4. IP: 11000000.10101000.00000000.00000001 Mask: 11111111.11111111.11111111.00000000 If we logically AND these together (only print a 1 if both are 1) then we get this: 11000000.10101000.00000000.00000000 Which if you turn into decimal again is 192.168.0.0. This tells us what the network address is. In other words, if we do the same for an IP address which is “next to” 192.168.0.1 like 192.168.0.42 then the network address is the same. This is what the mask does. It’s a bit mask applied with a bitwise AND operation. IP Address: 192.168.0.1 Subnet Mask: 255.255.255.0 Network: 192.168.0.0 IP Address: 192.168.0.42 Subnet Mask: 255.255.255.0 Network: 192.168.0.0 So, you can see that the network addresses for 192.168.0.1 and 192.168.0.42 above are both the same: 192.168.0.0. That means, don’t route it. It’s on the same network, just send it locally (make an Ethernet message). So, let’s create a function to do all this. 
use std::net::Ipv4Addr; pub fn same_subnet(src: Ipv4Addr, dest: Ipv4Addr, subnet: String) -&gt; bool { let subnet_parsed: Ipv4Addr = subnet.parse().unwrap(); let src_network = ipv4_to_u32(src.octets()) &amp; ipv4_to_u32(subnet_parsed.octets()); let dest_network = ipv4_to_u32(dest.octets()) &amp; ipv4_to_u32(subnet_parsed.octets()); src_network == dest_network } // to_bits is nightly experimental on Ipv4Addr so we have to do it ourselves fn ipv4_to_u32(octets: [u8; 4]) -&gt; u32 { ((octets[0] as u32) &lt;&lt; 24) | ((octets[1] as u32) &lt;&lt; 16) | ((octets[2] as u32) &lt;&lt; 8) | (octets[3] as u32) }]]></summary></entry><entry><title type="html">A Network in Rust, Part 1</title><link href="https://squarism.com/2024/05/05/a_network_in_rust_part_1/" rel="alternate" type="text/html" title="A Network in Rust, Part 1" /><published>2024-05-05T00:00:00-07:00</published><updated>2024-05-05T00:00:00-07:00</updated><id>https://squarism.com/2024/05/05/a_network_in_rust_part_1</id><content type="html" xml:base="https://squarism.com/2024/05/05/a_network_in_rust_part_1/"><![CDATA[<p><img alt="Abstract Networking" style="width: 100%; margin: auto;" src="/uploads/2024/networking_abstract.jpg" /></p>

<p>Let’s take the magic out of networking, shall we?</p>

<p>First things first, let’s establish some ground rules:</p>

<ol>
  <li>Our aim is to provide an abstraction level akin to that experienced by a Linux server or networking device user, not delving into the intricacies of network engineering.</li>
  <li>We’ll avoid nitty-gritty details of bits, electrons, and cables.  After all, when people say they don’t grasp networking, they’re usually not referring to the physics behind it.</li>
  <li>Our model won’t operate within an event loop.  While scripting or CLI tools are fantastic, they’re inherently limited by their sequential nature.  Real-world networking operates asynchronously, akin to a bustling city where events occur in parallel.  To replicate this, we would need to veer into text-mode game territory, and I don’t want to do this.  We can learn about networking without making a long-running program but some things will need to be faked.</li>
</ol>

<p>Now that we’ve set some boundaries, let’s name some exciting topics ahead.  We will explore bit-math, frames, packets, understand different abstraction layers, make a binary file that can <em>actually</em>  be read by <a href="https://www.wireshark.org/">Wireshark</a> and create a working local area network (LAN) using real specifications.</p>

<p>This is a series of posts exploring networking by building things:</p>

<ul>
  <li><a href="/2024/05/17/a_network_in_rust_part_2/">Part 2</a> - implements primitives like IP, ARP and an Interface</li>
  <li><a href="/2024/06/01/a_network_in_rust_part_3/">Part 3</a> - finishes the abstractions and shows the whole thing working</li>
</ul>

<p>For more in-depth background, here is <a href="https://www.youtube.com/playlist?list=PLDQaRcbiSnqF5U8ffMgZzS7fq1rHUI3Q8">a nice video series</a> but I encourage you to replicate my work and build a network yourself in Rust or some other language.  You’ll <a href="https://jvns.ca/blog/2023/05/12/introducing-implement-dns-in-a-weekend/">learn more by doing</a>.</p>

<h2 id="approximately-how-networking-works">Approximately How Networking Works</h2>

<p>Networking works in abstraction layers from the point of view of the OSI model.  From the bottom up, it goes (1) Physical, (2) Data Link and then (3) Network layer.  It continues to <a href="https://en.wikipedia.org/wiki/OSI_model">higher abstraction layers</a> but our simulation and this blog post series will stop at layer 3.</p>

<p><img alt="Network Abstraction Layers" style="width: 60%; margin: auto;" src="/uploads/2024/osi.png" /></p>

<p>Imagine a server named <code class="language-plaintext highlighter-rouge">box1</code> pings <code class="language-plaintext highlighter-rouge">box2</code>.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>user@box1 $ ping box2
</code></pre></div></div>

<p>The ICMP program <code class="language-plaintext highlighter-rouge">ping</code> is way up at layer 7 (not pictured below) and sends information down the left side of the picture below <code class="language-plaintext highlighter-rouge">ICMP client -&gt; IP -&gt; Ethernet -&gt; Switch Port 1</code>.  Eventually it turns into bits, electricity that flows across the switch through cables (not pictured) and then hits box2’s stack where it flows back up the right side.  This flow up the right side of the picture is shown below as <code class="language-plaintext highlighter-rouge">Switch Port 4 -&gt; Ethernet -&gt; IP -&gt; ICMP client</code>.  <em>(The ICMP responder would really be a part of the network stack on box2)</em>.  So notice that the flow down the left side and the flow up the right side mirror each other.</p>

<p><img alt="Network Abstraction Layers" style="width: 100%; margin: auto;" src="/uploads/2024/layer_three_layers.png" /></p>

<p>We will talk about all the details of all of this.  But here are key takeaways at this point:</p>

<ul>
  <li>Details are hidden by the abstraction layers.</li>
  <li>The layers are different.  The physical layer could be <a href="https://en.wikipedia.org/wiki/Ethernet">copper wires</a>, wifi, <a href="https://en.wikipedia.org/wiki/IP_over_Avian_Carriers">pigeons</a> or <a href="https://en.wikipedia.org/wiki/Fiber-optic_communication">glass</a>.  The other layers don’t know or care.  Ping works the same over wifi or wires.</li>
  <li>Each layer has its own concerns (non-leaky abstraction).</li>
</ul>

<!-- more -->

<h2 id="examples-in-the-world">Examples in the World</h2>

<p>You’ve probably seen some of these concepts in terms of things you interact with like <code class="language-plaintext highlighter-rouge">ip addr</code> or <code class="language-plaintext highlighter-rouge">ifconfig</code> or a network cable.  So, the point of this simulation is to try to model things around these common interactions and see if we can <em>fake ping</em> box2.</p>

<p><img alt="Network Interactions" style="width: 100%; margin: auto;" src="/uploads/2024/networking_interactions.jpg" /></p>

<p>Take a look at the <code class="language-plaintext highlighter-rouge"># ifconfig eno0</code> output above.  This command has been replaced in many Linux distributions with <code class="language-plaintext highlighter-rouge">ip addr</code>, which shows basically the same information in a slightly different format.  The IP address is listed as <code class="language-plaintext highlighter-rouge">inet 84.16.226.173</code>.  The netmask is <code class="language-plaintext highlighter-rouge">255.255.255.0</code> and the MAC address is under <code class="language-plaintext highlighter-rouge">ether 00:9c:02:9b:fd:9c</code>.</p>

<p>So in terms of modeling and the previous layer diagram, we will probably need at least the following nouns/objects/concepts/structs:</p>

<ol>
  <li>A server called <code class="language-plaintext highlighter-rouge">box1</code> with an IP address.  It could be <code class="language-plaintext highlighter-rouge">84.16.226.173</code> but we will use a more common private IP address of <code class="language-plaintext highlighter-rouge">192.168.0.1</code>.</li>
  <li>An IP address.  Something that represents an IP.  We won’t rebuild <code class="language-plaintext highlighter-rouge">ip addr</code> or <code class="language-plaintext highlighter-rouge">ifconfig</code> so we’ll need to make an IP address <em>“thing”</em> and a subnet mask.</li>
  <li>An Ethernet device.  Notice that in the ifconfig screenshot the interface is called <code class="language-plaintext highlighter-rouge">eno0</code>.  The device name is arbitrary depending on your device driver and hardware model.  We’re not going to model device driver details, all we need is an interface with a MAC address.  The MAC address is burned onto a network card at manufacturing time.  The MAC’s job is to locally identify an interface (sometimes called a NIC) on a local area network (LAN).</li>
  <li>You’ve seen Ethernet cables.  We won’t end up modeling cables but you’ve probably noticed the plug on a switch or a home router.  You’ve also probably noticed that when you plug in a port a light turns on.</li>
</ol>

<p>Notice that we can sort the things we have named so far into the same abstraction layers.</p>

<ul>
  <li>Layer 1 / Physical - Cables, ports on a switch, pins, bits, electricity</li>
  <li>Layer 2 / Data Link - Ethernet, MAC addresses, a link light</li>
  <li>Layer 3 / Network - IP address, subnet mask, the ICMP protocol which ping uses</li>
  <li>Layer 7 / Application - A program named <code class="language-plaintext highlighter-rouge">ping</code> in Linux and Windows</li>
</ul>

<p>Notice that we skipped numbers 4, 5 and 6.  This is on purpose because these layers do not have concepts we have named.  We can just ignore them for now for simplicity.  We are going to focus on Layers 2 &amp; 3 while faking Layer 1.</p>

<h2 id="starting-our-model">Starting our Model</h2>

<p>So, with these concepts named and sorted, we can start thinking about how to implement this.  There are many ways to implement all of this.  I arbitrarily selected Rust because it’s my current low-level, CLI type of language.  It’s also nice to have access to bytes and tooling that is lower down in abstractions.  Specifically, we’ll be using a library called <a href="https://docs.rs/etherparse/latest/etherparse/">etherparse</a> so we don’t have to define binary bit offsets from the specs and things like that.</p>

<p>First, let’s talk about the end-to-end flow of what happens with ping and then we’ll discover some things.  When <code class="language-plaintext highlighter-rouge">user@box1 $ ping box2</code> executes, this is approximately what happens:</p>

<ol>
  <li>Before anything happens, we had already created our network.  IE: bought two servers, a switch and some cables.</li>
  <li>We plugged box1 into port 1 and box2 into port 2 on an ethernet switch.</li>
  <li>The interfaces came with a MAC burned in at manufacturing time.  <code class="language-plaintext highlighter-rouge">box1</code> has <code class="language-plaintext highlighter-rouge">11:12:13:14:15:16</code>.  Each hex byte for box1 starts with <code class="language-plaintext highlighter-rouge">1</code> to make it easier to read.</li>
  <li>We assigned <code class="language-plaintext highlighter-rouge">192.168.0.1</code> to box1.  It ends with <code class="language-plaintext highlighter-rouge">.1</code> for ease of reading for the reader (and myself).</li>
  <li>We did the same for <code class="language-plaintext highlighter-rouge">box2</code> with <code class="language-plaintext highlighter-rouge">192.168.0.2</code> and the MAC <code class="language-plaintext highlighter-rouge">21:22:23:24:25:26</code>.</li>
</ol>
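<p>To make the setup concrete, here is a minimal sketch of those facts as plain data.  The <code class="language-plaintext highlighter-rouge">Host</code> struct and the <code class="language-plaintext highlighter-rouge">example_network</code> function are made up for illustration and are not taken from the series’ repository.  The <code class="language-plaintext highlighter-rouge">MacAddress</code> alias mirrors the one defined later in this post.</p>

```rust
use std::net::Ipv4Addr;

pub type MacAddress = [u8; 6];

/// Hypothetical stand-in for one server with a single interface.
#[derive(Debug, Clone, PartialEq)]
pub struct Host {
    pub name: &'static str,
    pub mac: MacAddress,
    pub ip: Ipv4Addr,
    /// The switch port this host's cable is plugged into.
    pub switch_port: u8,
}

/// The two-box network described in the list above.
pub fn example_network() -> Vec<Host> {
    vec![
        Host {
            name: "box1",
            mac: [0x11, 0x12, 0x13, 0x14, 0x15, 0x16],
            ip: Ipv4Addr::new(192, 168, 0, 1),
            switch_port: 1,
        },
        Host {
            name: "box2",
            mac: [0x21, 0x22, 0x23, 0x24, 0x25, 0x26],
            ip: Ipv4Addr::new(192, 168, 0, 2),
            switch_port: 2,
        },
    ]
}
```

<p>Nothing here does any networking yet.  It only pins down the names, addresses and ports that the rest of the walkthrough refers to.</p>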

<p>When ping fires,</p>

<ol>
  <li>The program <code class="language-plaintext highlighter-rouge">ping</code> tells the network stack to make an ICMP packet with the destination of box2.  This is layer 3.</li>
  <li>The ICMP packet destination of <code class="language-plaintext highlighter-rouge">box2</code> isn’t good enough.  It needs an IP address.  So, before the IP packet is made, the stack looks up the IP address for <code class="language-plaintext highlighter-rouge">box2</code>.  Normally, this would probably be done with DNS.  We are going to fake it with a fake hosts file.</li>
  <li>The IP layer now crafts a packet with the source of box1’s IP address and the destination IP of box2.</li>
  <li>The IP layer now finds the interface that this network packet needs to go out on.  Our simulation will only have one interface per server.</li>
  <li>Now we hit layer 2.  The IP address is not enough; we need to put this network packet on the Ethernet network.  So, we have to make an <em>ethernet frame</em>.  The ethernet frame has a source and a destination just like IP does but it speaks in MAC addresses, which are hardware addresses.</li>
  <li>The server resolves the MAC address with a protocol called ARP, which asks the network who owns the IP for <code class="language-plaintext highlighter-rouge">box2 (192.168.0.2)</code>.  The ARP reply says <code class="language-plaintext highlighter-rouge">21:22:23:24:25:26</code>.  Now, <code class="language-plaintext highlighter-rouge">box1</code> remembers this information in its ARP cache.</li>
  <li>Now <code class="language-plaintext highlighter-rouge">box1</code> sends the Ethernet frame out its interface to the switch on port 1.</li>
</ol>
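<p>Steps 2, 5 and 6 above can be sketched with nothing but the standard library.  This is only an illustration under some assumptions: the function names are hypothetical, two <code class="language-plaintext highlighter-rouge">HashMap</code>s stand in for the fake hosts file and the ARP cache, and the header is laid out by hand, whereas the series itself leans on etherparse for that.</p>

```rust
use std::collections::HashMap;
use std::net::Ipv4Addr;

pub type MacAddress = [u8; 6];

/// EtherType value announcing that the frame carries an IPv4 packet.
const ETHERTYPE_IPV4: [u8; 2] = [0x08, 0x00];

/// Step 2: resolve a host name using a fake hosts file (hypothetical helper).
pub fn resolve_name(hosts: &HashMap<&str, Ipv4Addr>, name: &str) -> Option<Ipv4Addr> {
    hosts.get(name).copied()
}

/// Step 6: resolve an IP to a MAC from the ARP cache.
/// A real stack would send an ARP request on a miss; here a miss is just `None`.
pub fn resolve_mac(arp_cache: &HashMap<Ipv4Addr, MacAddress>, ip: Ipv4Addr) -> Option<MacAddress> {
    arp_cache.get(&ip).copied()
}

/// Step 5: the 14-byte Ethernet II header is destination MAC, source MAC, EtherType.
pub fn ethernet_header(dst: MacAddress, src: MacAddress) -> [u8; 14] {
    let mut header = [0u8; 14];
    header[0..6].copy_from_slice(&dst);
    header[6..12].copy_from_slice(&src);
    header[12..14].copy_from_slice(&ETHERTYPE_IPV4);
    header
}
```

<p>Chaining the three calls mirrors the list: a name becomes an IP, the IP becomes a MAC, and the MAC goes at the front of the frame that leaves the interface.</p>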

<p>This is where the simulation implementation ends.  The switch in this example is a dumb layer 2 switch that hasn’t seen any traffic from anybody.  All it has seen is an Ethernet frame coming from port 1 with a destination of <code class="language-plaintext highlighter-rouge">21:22:23:24:25:26</code>.  It doesn’t know where <code class="language-plaintext highlighter-rouge">21:22:23:24:25:26</code> is.  So normally, it would <em>flood</em> the frame out of every port except the one it arrived on, hoping the owner is attached to one of them.  For our example, we stop here.  If we implemented the switch some more, we would pass the ethernet frame into the interface attached to port 2.</p>
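<p>If we did take the switch further, the learn-then-flood behavior could look roughly like this.  The <code class="language-plaintext highlighter-rouge">Switch</code> and <code class="language-plaintext highlighter-rouge">Forward</code> types are hypothetical and not part of the simulation, and the sketch ignores details like broadcast addresses and table entries aging out.</p>

```rust
use std::collections::HashMap;

pub type MacAddress = [u8; 6];

/// What the switch decides to do with an incoming frame.
#[derive(Debug, PartialEq)]
pub enum Forward {
    /// Destination is known: send the frame out exactly one port.
    ToPort(u8),
    /// Destination is unknown: send it out every port except the ingress port.
    Flood(Vec<u8>),
}

/// Hypothetical dumb layer 2 switch with a MAC-to-port table.
pub struct Switch {
    ports: Vec<u8>,
    mac_table: HashMap<MacAddress, u8>,
}

impl Switch {
    pub fn new(ports: Vec<u8>) -> Self {
        Switch { ports, mac_table: HashMap::new() }
    }

    pub fn receive(&mut self, in_port: u8, src: MacAddress, dst: MacAddress) -> Forward {
        // Learn: the source MAC is reachable through the port the frame came in on.
        self.mac_table.insert(src, in_port);

        match self.mac_table.get(&dst) {
            Some(&port) => Forward::ToPort(port),
            None => Forward::Flood(
                self.ports.iter().copied().filter(|&p| p != in_port).collect(),
            ),
        }
    }
}
```

<p>With a fresh four-port switch, the first frame from box1 to box2 gets flooded to ports 2, 3 and 4.  Once box2 has sent anything back, the switch has learned both MACs and later frames go to a single port.</p>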

<p>There are many ways we could have modeled this but for now, this is enough.  Notice that we have run into some concepts here like <code class="language-plaintext highlighter-rouge">box</code>, <code class="language-plaintext highlighter-rouge">IP</code>, <code class="language-plaintext highlighter-rouge">switch</code>, <code class="language-plaintext highlighter-rouge">MAC</code>, <code class="language-plaintext highlighter-rouge">ARP</code> and some others.  These are some of the things that we will start modeling.</p>

<h3 id="our-first-model-the-mac-address">Our first model, the MAC address</h3>

<p>A MAC address is what was in the <code class="language-plaintext highlighter-rouge">ether</code> field from the ifconfig screenshot we saw earlier.  It’s a series of bytes in hex.  An ethernet device has one but <em>dumb switches</em> also use them for passing along packets.  We’ll talk about switches more in later posts but for now know that several things need to know about MAC addresses.  Without explaining each one now, these are the things we will find out that we need to have MAC addresses for:</p>

<ol>
  <li>An ARP broadcast asks the local network if anyone knows about the ownership of an IP address</li>
  <li>A switch keeps a copy of MAC addresses it has seen in a table called a CAM table</li>
  <li>An ethernet interface has its MAC address usually burned onto a ROM</li>
</ol>
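<p>As a taste of the first item, here is a hand-rolled sketch of what an ARP request carries for Ethernet and IPv4, following the field layout in RFC 826.  The function name is hypothetical, and building the bytes by hand is only for illustration.</p>

```rust
use std::net::Ipv4Addr;

pub type MacAddress = [u8; 6];

/// Every interface on the local network accepts frames sent to this address.
pub const BROADCAST_MAC: MacAddress = [0xff; 6];

/// Build the 28-byte body of an ARP request (hypothetical helper).
pub fn arp_request(sender_mac: MacAddress, sender_ip: Ipv4Addr, target_ip: Ipv4Addr) -> [u8; 28] {
    let mut packet = [0u8; 28];
    packet[0..2].copy_from_slice(&[0x00, 0x01]); // hardware type: Ethernet
    packet[2..4].copy_from_slice(&[0x08, 0x00]); // protocol type: IPv4
    packet[4] = 6; // hardware address length (a MAC is 6 bytes)
    packet[5] = 4; // protocol address length (an IPv4 address is 4 bytes)
    packet[6..8].copy_from_slice(&[0x00, 0x01]); // operation: 1 = request
    packet[8..14].copy_from_slice(&sender_mac);
    packet[14..18].copy_from_slice(&sender_ip.octets());
    // Bytes 18..24 are the target MAC. They stay zeroed because that is the question.
    packet[24..28].copy_from_slice(&target_ip.octets());
    packet
}
```

<p>The request goes out in an ethernet frame addressed to <code class="language-plaintext highlighter-rouge">BROADCAST_MAC</code>, which is why every box on the segment sees it and only the owner of the target IP answers.</p>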

<p>So, this MAC address concept is coming up a lot.  We should model a <code class="language-plaintext highlighter-rouge">MacAddress</code> type.  We’ll also want to print this out in a friendly format for debugging so we will make a <code class="language-plaintext highlighter-rouge">to_string()</code> function.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">pub</span> <span class="k">type</span> <span class="n">MacAddress</span> <span class="o">=</span> <span class="p">[</span><span class="nb">u8</span><span class="p">;</span> <span class="mi">6</span><span class="p">];</span>

<span class="k">pub</span> <span class="k">fn</span> <span class="nf">to_string</span><span class="p">(</span><span class="n">mac</span><span class="p">:</span> <span class="o">&amp;</span><span class="p">[</span><span class="nb">u8</span><span class="p">;</span> <span class="mi">6</span><span class="p">])</span> <span class="k">-&gt;</span> <span class="nb">String</span> <span class="p">{</span>
    <span class="c1">// Convert each byte to a two-character hex string and join them with colons</span>
    <span class="n">mac</span><span class="nf">.iter</span><span class="p">()</span>
        <span class="nf">.map</span><span class="p">(|</span><span class="n">byte</span><span class="p">|</span> <span class="nd">format!</span><span class="p">(</span><span class="s">"{:02x}"</span><span class="p">,</span> <span class="n">byte</span><span class="p">))</span> <span class="c1">// Ensure two digits with padding if necessary, and lowercase hex</span>
        <span class="py">.collect</span><span class="p">::</span><span class="o">&lt;</span><span class="nb">Vec</span><span class="o">&lt;</span><span class="n">_</span><span class="o">&gt;&gt;</span><span class="p">()</span>
        <span class="nf">.join</span><span class="p">(</span><span class="s">":"</span><span class="p">)</span>
<span class="p">}</span>

<span class="k">pub</span> <span class="k">fn</span> <span class="nf">parse_mac_address</span><span class="p">(</span><span class="n">mac</span><span class="p">:</span> <span class="o">&amp;</span><span class="nb">str</span><span class="p">)</span> <span class="k">-&gt;</span> <span class="n">MacAddress</span> <span class="p">{</span>
    <span class="k">let</span> <span class="n">parts</span><span class="p">:</span> <span class="nb">Vec</span><span class="o">&lt;&amp;</span><span class="nb">str</span><span class="o">&gt;</span> <span class="o">=</span> <span class="n">mac</span><span class="nf">.split</span><span class="p">(</span><span class="sc">':'</span><span class="p">)</span><span class="nf">.collect</span><span class="p">();</span>
    <span class="k">if</span> <span class="n">parts</span><span class="nf">.len</span><span class="p">()</span> <span class="o">!=</span> <span class="mi">6</span> <span class="p">{</span>
        <span class="nd">panic!</span><span class="p">(</span><span class="s">"Invalid MAC address format"</span><span class="p">);</span>
    <span class="p">}</span>

    <span class="k">let</span> <span class="k">mut</span> <span class="n">mac_array</span> <span class="o">=</span> <span class="p">[</span><span class="mi">0u8</span><span class="p">;</span> <span class="mi">6</span><span class="p">];</span>
    <span class="k">for</span> <span class="p">(</span><span class="n">i</span><span class="p">,</span> <span class="n">part</span><span class="p">)</span> <span class="k">in</span> <span class="n">parts</span><span class="nf">.iter</span><span class="p">()</span><span class="nf">.enumerate</span><span class="p">()</span> <span class="p">{</span>
        <span class="n">mac_array</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">=</span> <span class="nn">u8</span><span class="p">::</span><span class="nf">from_str_radix</span><span class="p">(</span><span class="n">part</span><span class="p">,</span> <span class="mi">16</span><span class="p">)</span><span class="nf">.expect</span><span class="p">(</span><span class="s">"Invalid hex value in MAC address"</span><span class="p">);</span>
    <span class="p">}</span>
    <span class="n">mac_array</span>
<span class="p">}</span>


<span class="nd">#[cfg(test)]</span>
<span class="k">mod</span> <span class="n">tests</span> <span class="p">{</span>
    <span class="k">use</span> <span class="k">super</span><span class="p">::</span><span class="o">*</span><span class="p">;</span>

    <span class="nd">#[test]</span>
    <span class="k">fn</span> <span class="nf">test_parse_mac_address_valid</span><span class="p">()</span> <span class="p">{</span>
        <span class="k">let</span> <span class="n">mac_str</span> <span class="o">=</span> <span class="s">"AA:BB:CC:DD:EE:FF"</span><span class="p">;</span>
        <span class="k">let</span> <span class="n">expected</span> <span class="o">=</span> <span class="p">[</span><span class="mi">0xAA</span><span class="p">,</span> <span class="mi">0xBB</span><span class="p">,</span> <span class="mi">0xCC</span><span class="p">,</span> <span class="mi">0xDD</span><span class="p">,</span> <span class="mi">0xEE</span><span class="p">,</span> <span class="mi">0xFF</span><span class="p">];</span>
        <span class="nd">assert_eq!</span><span class="p">(</span><span class="nf">parse_mac_address</span><span class="p">(</span><span class="n">mac_str</span><span class="p">),</span> <span class="n">expected</span><span class="p">);</span>
    <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>

<p>The tests section shows its usage.  It will be clearer what this type is doing once we use it with an interface or an ARP function.  But for now, that’s our first model.  The others will be similar.</p>
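<p>One possible variation, not what the repository does: wrap the bytes in a newtype and implement the standard <code class="language-plaintext highlighter-rouge">Display</code> and <code class="language-plaintext highlighter-rouge">FromStr</code> traits.  That gives a real <code class="language-plaintext highlighter-rouge">.to_string()</code> method for free and lets parsing return an error instead of panicking.  The <code class="language-plaintext highlighter-rouge">Mac</code> name is made up here to avoid clashing with the alias above.</p>

```rust
use std::fmt;
use std::str::FromStr;

/// Hypothetical newtype alternative to the `[u8; 6]` alias.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct Mac(pub [u8; 6]);

impl fmt::Display for Mac {
    // Lowercase, zero-padded hex pairs joined by colons.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let b = self.0;
        write!(
            f,
            "{:02x}:{:02x}:{:02x}:{:02x}:{:02x}:{:02x}",
            b[0], b[1], b[2], b[3], b[4], b[5]
        )
    }
}

impl FromStr for Mac {
    type Err = String;

    // Same parsing idea as parse_mac_address, but bad input becomes an Err.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        let parts: Vec<&str> = s.split(':').collect();
        if parts.len() != 6 {
            return Err(format!("expected 6 octets, found {}", parts.len()));
        }
        let mut bytes = [0u8; 6];
        for (i, part) in parts.iter().enumerate() {
            bytes[i] = u8::from_str_radix(part, 16)
                .map_err(|e| format!("invalid octet {:?}: {}", part, e))?;
        }
        Ok(Mac(bytes))
    }
}
```

<p>Usage then reads like any other parseable type: <code class="language-plaintext highlighter-rouge">"21:22:23:24:25:26".parse::&lt;Mac&gt;()</code> returns a <code class="language-plaintext highlighter-rouge">Result</code>, and <code class="language-plaintext highlighter-rouge">mac.to_string()</code> prints it back.  The trade-off is a little more ceremony than a plain type alias.</p>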

<p>In the next post, we’ll continue modeling out concepts.  The source code will not be completely explained and duplicated in following posts but <a href="https://github.com/squarism/layer_three">is available on Github</a>.</p>]]></content><author><name></name></author><summary type="html"><![CDATA[Let’s take the magic out of networking, shall we? First things first, let’s establish some ground rules: Our aim is to provide an abstraction level akin to that experienced by a Linux server or networking device user, not delving into the intricacies of network engineering. We’ll avoid nitty-gritty details of bits, electrons, and cables. After all, when people say they don’t grasp networking, they’re usually not referring to the physics behind it. Our model won’t operate within an event loop. While scripting or CLI tools are fantastic, they’re inherently limited by their sequential nature. Real-world networking operates asynchronously, akin to a bustling city where events occur in parallel. To replicate this, we would need to veer into text-mode game territory, and I don’t want to do this. We can learn about networking without making a long-running program but some things will need to be faked. Now that we’ve set some boundaries, let’s name some exciting topics ahead. We will explore bit-math, frames, packets, understand different abstraction layers, make a binary file that can actually be read by Wireshark and create a working local area network (LAN) using real specifications. This is a series of posts exploring networking by building things: Part 2 - implements primitives like IP, ARP and an Interface Part 3 - finishes the abstractions and shows the whole thing working For more in-depth background, here is a nice video series but I encourage you to replicate my work and build a network yourself in Rust or some other language. You’ll learn more by doing. Approximately How Networking Works Networking works in abstraction layers from the point of view of the OSI model. 
From the bottom up, it goes (1) Physical, (2) Data Link and then (3) Network layer. It continues to higher abstraction layers but our simulation and this blog post series will stop at layer 3. Imagine a server named box1 pings box2. user@box1 $ ping box2 The ICMP program ping is way up at layer 7 (not pictured below) and sends information down the left side of the picture below ICMP client -&gt; IP -&gt; Ethernet -&gt; Switch Port 1. Eventually it turns into bits, electricity that flows across the switch through cables (not pictured) and then hits box2’s stack where it flows back up the right side. This flow up the right side of the picture is shown below as Switch Port 4 -&gt; Ethernet -&gt; IP -&gt; ICMP client. (The ICMP responder would really be a part of the network stack on box2). So notice that the flow down the left side and the flow up the right side are opposite and reversed. We will talk about all the details of all of this. But here are key takeaways at this point: Details are hidden by the abstraction layers. The layers are different. The physical layer could be copper wires, wifi, pigeons or glass. The other layers don’t know or care. Ping works the same over wifi or wires. Each layer has its own concerns (non-leaky abstraction).]]></summary></entry><entry><title type="html">Microframeworks Are Too Small</title><link href="https://squarism.com/2023/12/20/microframeworks/" rel="alternate" type="text/html" title="Microframeworks Are Too Small" /><published>2023-12-20T00:00:00-08:00</published><updated>2023-12-20T00:00:00-08:00</updated><id>https://squarism.com/2023/12/20/microframeworks</id><content type="html" xml:base="https://squarism.com/2023/12/20/microframeworks/"><![CDATA[<p>I want to talk about what microframeworks don’t solve but also why there have been, and continue to be, so many of them.  First, I guess we should identify some terms. Identifying if a web framework is a microframework is debatable but I would call it a microframework if:</p>

<ul>
  <li>The author calls it a microframework</li>
  <li>It has a small hello world example that looks like Sinatra in the README or the homepage</li>
  <li>It doesn’t pre-solve a common problem for you</li>
</ul>

<p>So, Flask / Express / Sinatra are microframeworks as opposed to their sister projects Django / Next.js or others / Rails.  If the language ecosystem offers a pair like Django vs Flask, then Flask is the microframework by comparison, and not just because <a href="https://en.wikipedia.org/wiki/Flask_(web_framework)">Wikipedia also classifies it</a> this way.  It’s because it’s smaller than Django.  This is not obvious if a language does not have a bigger framework.</p>

<p>The “pre-solve a common problem for you” is the big one.  To make this more concrete, I will be using the use
case of adding a database to your project as an inflection point.  Larger frameworks usually have some kind of story around adding a database and the microframeworks do not.</p>

<h2 id="it-started-with-sinatra">It Started with Sinatra</h2>

<p><img alt="Old Sinatra Website" style="width: 50%; margin: auto;" src="/uploads/2023/sinatra.jpg" /></p>

<p>There are many microframeworks in every language but it started with <a href="https://sinatrarb.com/">Sinatra</a>.  Sinatra had this very cute and
small README snippet on their home page.  <em>“Put this in your pipe and smoke it”</em> was the original tagline.  The picture above is from The Wayback Machine.  It
was compelling and provocative at the time.  You wire up a route and you get some plain text
back.  The whole idea fits in a code snippet and this was hard to ignore.</p>

<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">require</span> <span class="s1">'sinatra'</span>

<span class="n">get</span> <span class="s1">'/'</span> <span class="k">do</span>
  <span class="s1">'Hello world!'</span>
<span class="k">end</span>
</code></pre></div></div>

<p>The trade-offs of Sinatra are not obvious in the above Hello World.  To me, the simplicity is enticing to juniors precisely because they don’t yet know what they are trading away.  The API surface is small which is easier to learn.  But confusing things start to happen after that.  If you start a Sinatra
project, you will realize that when you change your code, you have to <code class="language-plaintext highlighter-rouge">ctrl-c</code> to stop the dev server and start your app again with <code class="language-plaintext highlighter-rouge">ruby myapp.rb</code>.  Every code change, switch terminal, <code class="language-plaintext highlighter-rouge">ctrl-c</code>, <code class="language-plaintext highlighter-rouge">up-arrow</code>, <code class="language-plaintext highlighter-rouge">enter</code>.  That was just changing code, manual
testing.  Yes, there was a plugin to do dev reloading but so many other questions and concerns would appear after this most basic flow.  It doesn’t come with dev reloading and common use cases continue from there:</p>

<blockquote>
  <p>Where do I put passwords? <br />
How can I add a database? <br />
How do I build or deploy this thing?</p>
</blockquote>

<p>So, as the project grew or if you simply kept
working with it you would have to research and solve these problems, or enhance Sinatra, yourself.  Many times, others and I
would copy whole files out of Rails default projects.  Like, we’d generate a Rails project off to the side and steal <code class="language-plaintext highlighter-rouge">.rb</code> files from it.  Or, we’d end up stealing ideas from Rails.  Ideas like development server reloading, where to put configs, test fixtures or the concept of dev/test/prod.</p>

<p>But I think these DX nit-picks are not where the test is.  I think adding a database to a microframework is the inflection point where the lack of features makes things tedious and confusing.</p>

<h2 id="the-database-is-the-inflection-point">The Database is the Inflection Point</h2>

<p>I think configuring a database stresses the framework and most microframeworks fail here.  This isn’t quite <a href="https://squarism.com/2021/07/08/databases-ruin-all-good-ideas/">The Database Ruins All Good Ideas</a>, it’s more like, The Database Makes the Framework Creak.</p>

<p>Some teams running Flask might say “we have a database in Flask already, it’s easy” but what they really have is a bunch of hidden context. Take this example of how some work with Flask:</p>

<ol>
  <li>There’s a production database which a Flask app connects to.  This is the only database in the world for
this application.  There is no dev database, even on a laptop.</li>
  <li>The SQL to create this empty database was copied to something like Dropbox, email or a file share.  Or maybe it never was.  “Why would you need an empty database?”</li>
  <li>The database rarely changes but not by intent.  When a change needs to happen,
stress and confusion levels are high.  It might even be deemed impossible.</li>
  <li>The password for the database connection is in git and the repo is set to private.  This password is never rotated when team members leave.</li>
  <li>When you boot Flask to do development, Flask connects to the production database because that’s where the data is (“what else could we do?”).</li>
</ol>

<p>In some cases, the fact that this setup works at all is sometimes related to the circumstance that the application
is a read-only visualization or dashboard.  If the application had become read-write, this entire idea would
fall apart.  Or, maybe there’s a “database person” that essentially is a mutex for the team.  Or, maybe the app is so small in scope that this is fine.</p>

<p>So far, this hasn’t been my experience.  My experience has been to take ideas as needed from full frameworks and bring them into Flask.  These concepts are not obvious in Flask because Flask does not have them.  Take dev/test/prod.  In order to add dev/test/prod, we need to:</p>

<ul>
  <li>Get the passwords out of git and introduce the dotenv library</li>
  <li>Add some configuration files, maybe with Dynaconf</li>
  <li>Add database migrations and seeding</li>
  <li>Have every dev have a local database</li>
  <li>Have some example data or factories or something</li>
</ul>

<p>If we get this far, a pull request could have code and schema changes proposed.  Without it, we’re unlikely to mess around with the database structure.</p>

<p>You might be quick to blame the team.  I’m quick to blame the tool but blame isn’t interesting here.  I think Flask influences thinking and trades off too far in the direction of simplicity.  There is no dev/test/prod in Flask so I have to invent it by composing libraries
together on top of Flask.  Actually getting this to work with tests and <code class="language-plaintext highlighter-rouge">conftest.py</code> is non-trivial.  I’m not saying there is no trade-off either.  There is absolutely a trade-off.  Adding a dev database might confuse juniors.  We might have to introduce Docker or <a href="https://devenv.sh/">try to solve</a> development environments.  I might have just invented my own <a href="https://flask-unchained.readthedocs.io/en/latest/">flask-unchained</a> by selecting libraries.</p>

<p>For very simple applications, this simple setup might work fine.  I don’t think it works for long.  I think applications usually grow in complexity and features.  I think it is common to have Flask fall apart in your hands.  I think teams in this situation have only been exposed to Flask.  My experience in these cases has been to introduce ideas from other frameworks to Flask-only teams.  My experience is that even small apps outgrow Flask because most apps have state in the database and the default experience (not just Flask) is awful.</p>

<h3 id="most-of-the-work-is-explanation">Most of the Work is Explanation</h3>

<p>For dev/test/prod and Flask, the change in thinking had to be the following.  There isn’t just one database.  There are many instances of this
database in different environments for different purposes.  The old view is a flawed “shared development server” style of thinking where the only database available is the production one.  It’s not A database, it’s THE database.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>┌────────────────┐
│ "THE" Database │
│                │
└────────────────┘
         ▲        
         │        
         │        
┌────────────────┐
│ Flask (laptop) │
│                │
└────────────────┘
</code></pre></div></div>

<p>Instead, thinking of it as many instances in different contexts or environments opens up many options.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>┌───────────────────┐  ┌───────────────────┐  ┌───────────────────┐
│ A Database (dev)  │  │ A Database (test) │  │ A Database (prod) │
│                   │  │                   │  │                   │
└───────────────────┘  └───────────────────┘  └───────────────────┘
          ▲                      ▲                      ▲          
          │                      │                      │          
          │                      │                      │          
┌──────────────────┐   ┌──────────────────┐   ┌──────────────────┐ 
│   Flask (dev)    │   │   Flask (test)   │   │   Flask (prod)   │ 
│                  │   │                  │   │                  │ 
└──────────────────┘   └──────────────────┘   └──────────────────┘
</code></pre></div></div>
<p>There are many databases and not just one.  There is no shared development server or shared state.  There is no shared development anything.  Test might be CI but test can
also be your laptop.  It is the dev -&gt; test -&gt; prod <a href="https://12factor.net/config">progression</a>.</p>
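<p>Nothing about this split is specific to Flask or Python.  As a language-neutral illustration, here is a minimal sketch written in Rust.  Everything in it is an assumption made up for the example: the <code class="language-plaintext highlighter-rouge">APP_ENV</code> and <code class="language-plaintext highlighter-rouge">DATABASE_URL</code> variable names, the <code class="language-plaintext highlighter-rouge">myapp_dev</code> and <code class="language-plaintext highlighter-rouge">myapp_test</code> database names, and the rule that prod refuses to start without an explicit URL.</p>

```rust
/// The three environments from the diagram above.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Environment {
    Dev,
    Test,
    Prod,
}

impl Environment {
    /// Parse the value of a hypothetical `APP_ENV` variable, defaulting to Dev.
    pub fn from_name(name: Option<&str>) -> Environment {
        match name {
            Some("prod") => Environment::Prod,
            Some("test") => Environment::Test,
            _ => Environment::Dev,
        }
    }
}

/// Pick the database URL for one environment.
/// `lookup` stands in for reading environment variables, so no password lives in the code.
pub fn database_url<F>(env: Environment, lookup: F) -> Result<String, String>
where
    F: Fn(&str) -> Option<String>,
{
    match env {
        // Dev and test fall back to a local database on the laptop.
        Environment::Dev => Ok(lookup("DATABASE_URL")
            .unwrap_or_else(|| "postgres://localhost/myapp_dev".to_string())),
        Environment::Test => Ok(lookup("DATABASE_URL")
            .unwrap_or_else(|| "postgres://localhost/myapp_test".to_string())),
        // Prod never guesses: a missing URL is an error, not a default.
        Environment::Prod => lookup("DATABASE_URL")
            .ok_or_else(|| "DATABASE_URL must be set in prod".to_string()),
    }
}
```

<p>In a real program the lookup would be something like <code class="language-plaintext highlighter-rouge">|key| std::env::var(key).ok()</code>.  The point is only that the same code boots against a different database in each environment, and that booting on a laptop can never silently reach production.</p>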

<p>Thinking of it this way solves problems like:</p>
<ul>
  <li>how can we rotate passwords or get them out of git?</li>
  <li>how can I change the database structure while it is running?</li>
  <li>how can we scale from 1 developer to 5?</li>
</ul>

<p>But this is not what Flask comes with in the Hello World example nor does Flask teach you by example what this
concept is.  In order to get there you need to add plugins, libraries and configuration.  Of course, full frameworks cannot solve all your problems (you have to code something).  But the point where you need to diverge from the framework usually comes much later in the project’s life, and until then a good framework will be solving common problems for you (better than you could have done).  It’s reuse.</p>

<h2 id="trade-offs">Trade Offs</h2>

<p>I think a lot of this is inevitable given an assumption of what class of application your application will
turn out to be.  Even with this viewpoint that I and others have, it was annoying when a greenfield would come
along and we didn’t want to include all of Rails but we also knew that we’d regret picking Sinatra later
because we’d have to at least handle things like passwords, ENVs, a database connection pool and development
quality of life things.  The trade-offs were known, which is why a microframework was being considered.
But, do we need all of Rails, all the time?  It’s annoying to have to have every concern included by default.  Maybe we
don’t have and won’t have an admin interface.  Maybe we will never send email.  But then we think through all the missing features and realize that we’d have to copy and steal ideas.  This is also what happens in FastAPI to <a href="https://fastapi.tiangolo.com/tutorial/sql-databases/">configure the database</a>.  Because FastAPI has “no database ideas” in it, you have to copy and paste configuration from documentation.  Ideally, I’d want to opt-out of features in a full-framework and have it be configurable.  Even with opt-out, the complexity is still there so I understand the draw to a small API surface area.</p>

<p>Flask was inspired by Sinatra.  It has the same trade-offs as Sinatra.  If you try to enhance Flask to have more concepts in
it then you end up with your own framework.  Look at
<a href="https://flask-unchained.readthedocs.io/en/latest/">flask-unchained</a>.  The features that flask-unchained adds are
the features that Flask does not have.  It comes with an opinion on databases, it has a structure for APIs
etc.  The issue with these projects is they are not authored by the Flask maintainers.  So the trade-off is a
flexible plugin system versus open-source maintenance issues.  Many Flask plugins are not maintained.  The
same issue happens in other languages.  Full-featured frameworks usually control more of the stack.  So, when
they do a release, those things that they include or have written are bumped or fixed so they work in the new
release.  In theory, a plugin system sounds ideal because it has flexibility.  What I’m arguing is that CORS,
authentication and database state are extremely common and these things should be in the framework.</p>

<p>More than that, copying and pasting configs from FastAPI docs is one-way and subject to <a href="https://squarism.com/2019/06/04/sprinkle-time-on-that-thing/">bit-rot</a>.  Controlling configuration is extremely challenging even in a full framework, but there are usually step-by-step upgrade guides.  Those only work when you can name the version you are on.  If you are copying and pasting configuration and code, what version of FastAPI are you on?</p>

<h2 id="a-tour-of-small-readmes">A Tour of Small READMEs</h2>

<p>There are many microframeworks in many languages.  Most of them have a small hello world README just like <a href="https://sinatrarb.com/intro.html">Sinatra</a> did.</p>

<p><a href="https://github.com/labstack/echo">Echo in Go</a></p>

<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">package</span> <span class="n">main</span>

<span class="k">import</span> <span class="p">(</span>
  <span class="s">"github.com/labstack/echo/v4"</span>
  <span class="s">"github.com/labstack/echo/v4/middleware"</span>
  <span class="s">"net/http"</span>
<span class="p">)</span>

<span class="k">func</span> <span class="n">main</span><span class="p">()</span> <span class="p">{</span>
  <span class="c">// Echo instance</span>
  <span class="n">e</span> <span class="o">:=</span> <span class="n">echo</span><span class="o">.</span><span class="n">New</span><span class="p">()</span>

  <span class="c">// Middleware</span>
  <span class="n">e</span><span class="o">.</span><span class="n">Use</span><span class="p">(</span><span class="n">middleware</span><span class="o">.</span><span class="n">Logger</span><span class="p">())</span>
  <span class="n">e</span><span class="o">.</span><span class="n">Use</span><span class="p">(</span><span class="n">middleware</span><span class="o">.</span><span class="n">Recover</span><span class="p">())</span>

  <span class="c">// Routes</span>
  <span class="n">e</span><span class="o">.</span><span class="n">GET</span><span class="p">(</span><span class="s">"/"</span><span class="p">,</span> <span class="n">hello</span><span class="p">)</span>

  <span class="c">// Start server</span>
  <span class="n">e</span><span class="o">.</span><span class="n">Logger</span><span class="o">.</span><span class="n">Fatal</span><span class="p">(</span><span class="n">e</span><span class="o">.</span><span class="n">Start</span><span class="p">(</span><span class="s">":1323"</span><span class="p">))</span>
<span class="p">}</span>

<span class="c">// Handler</span>
<span class="k">func</span> <span class="n">hello</span><span class="p">(</span><span class="n">c</span> <span class="n">echo</span><span class="o">.</span><span class="n">Context</span><span class="p">)</span> <span class="kt">error</span> <span class="p">{</span>
  <span class="k">return</span> <span class="n">c</span><span class="o">.</span><span class="n">String</span><span class="p">(</span><span class="n">http</span><span class="o">.</span><span class="n">StatusOK</span><span class="p">,</span> <span class="s">"Hello, World!"</span><span class="p">)</span>
<span class="p">}</span>
</code></pre></div></div>

<p><a href="https://kemalcr.com">Kemal in Crystal</a></p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">require</span> <span class="s2">"kemal"</span>

<span class="c1"># Matches GET "http://host:port/"</span>
<span class="n">get</span> <span class="s2">"/"</span> <span class="k">do</span>
  <span class="s2">"Hello World!"</span>
<span class="k">end</span>

<span class="c1"># Creates a WebSocket handler.</span>
<span class="c1"># Matches "ws://host:port/socket"</span>
<span class="n">ws</span> <span class="s2">"/socket"</span> <span class="k">do</span> <span class="o">|</span><span class="n">socket</span><span class="o">|</span>
  <span class="n">socket</span><span class="p">.</span><span class="nf">send</span> <span class="s2">"Hello from Kemal!"</span>
<span class="k">end</span>

<span class="no">Kemal</span><span class="p">.</span><span class="nf">run</span>
</code></pre></div></div>

<p><a href="https://expressjs.com/en/starter/hello-world.html">Express in Node</a></p>
<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">express</span> <span class="o">=</span> <span class="nx">require</span><span class="p">(</span><span class="dl">'</span><span class="s1">express</span><span class="dl">'</span><span class="p">)</span>
<span class="kd">const</span> <span class="nx">app</span> <span class="o">=</span> <span class="nx">express</span><span class="p">()</span>
<span class="kd">const</span> <span class="nx">port</span> <span class="o">=</span> <span class="mi">3000</span>

<span class="nx">app</span><span class="p">.</span><span class="kd">get</span><span class="p">(</span><span class="dl">'</span><span class="s1">/</span><span class="dl">'</span><span class="p">,</span> <span class="p">(</span><span class="nx">req</span><span class="p">,</span> <span class="nx">res</span><span class="p">)</span> <span class="o">=&gt;</span> <span class="p">{</span>
  <span class="nx">res</span><span class="p">.</span><span class="nx">send</span><span class="p">(</span><span class="dl">'</span><span class="s1">Hello World!</span><span class="dl">'</span><span class="p">)</span>
<span class="p">})</span>

<span class="nx">app</span><span class="p">.</span><span class="nx">listen</span><span class="p">(</span><span class="nx">port</span><span class="p">,</span> <span class="p">()</span> <span class="o">=&gt;</span> <span class="p">{</span>
  <span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="s2">`Example app listening on port </span><span class="p">${</span><span class="nx">port</span><span class="p">}</span><span class="s2">`</span><span class="p">)</span>
<span class="p">})</span>
</code></pre></div></div>

<p>These examples are easy to understand, which is a <em>pro</em> in the complexity trade-off.  What they do not show is the other side of the trade-off.</p>

<h2 id="there-are-few-fully-featured-frameworks">There are Few Fully-Featured Frameworks</h2>

<p>There are very few fully-featured frameworks and there are usually only one or two in each language.</p>

<ul>
  <li>Python has Django</li>
  <li>JavaScript has Next or Remix or RedwoodJS</li>
  <li>Java has Spring Boot</li>
  <li>PHP has Laravel</li>
  <li>Ruby has Rails</li>
  <li>C# has .NET</li>
  <li>Elixir has Phoenix</li>
</ul>

<p>But there are very few others.  It seems that language communities tend to rally around one or two full
frameworks and these frameworks last a long time.  The microframeworks might churn a lot more because there is less to throw away or because they are easy to invent?  Flask interest turns to FastAPI.  Sinatra interest turns to <a href="https://github.com/ruby-grape/grape">Grape</a>.  Express interest turns to Fastify.  But Django interest rarely turns into anything except a newer version of Django.  <a href="https://docs.masoniteproject.com/">Masonite</a> is a rare exception (I have not tried it at scale).  It’s much easier to create a hobby microframework project that is small in scope.</p>

<p>I think this is because full frameworks take 7 years to do right and this is assuming a productive or
high-level language.  The amount of work it takes to make a framework that is documented, tested and
feature-rich is huge.  Usually it seems to require a sponsored company or extraction out of a working business.  Django
was extracted out of a newspaper company.  Rails was extracted out of a team management company.  Next was
extracted or built in parallel out of a hosting company.  RedwoodJS is from <a href="https://github.com/mojombo">an exited GitHub
founder</a>.  I think it is interesting that this is the scale that
is required but it also might help us predict what is possible.  Can a hobbyist disrupt a full framework?  Can
a single person invent a Django or Rails killer?</p>

<p>We can almost skip paying attention to microframeworks because they will be the first to go.  I am not trying to FUD people out of
hobby projects.  Do it, make your thing.  But unless it has some very novel idea in it, it’s unlikely to
stick.  We’ve <a href="https://www.techempower.com/benchmarks/#hw=ph&amp;test=fortune&amp;section=data-r22&amp;c=d">had hundreds of microframeworks
already</a>.  So if the real reason
for invention is “I did it because I could” then this isn’t good enough.  We need more “what we need” even if
it takes years.  This is my main point.  Full-featured frameworks are harder to make and harder to learn.  But they have ideas in them that you should probably know about.  Critically think about the Hello World README example in Flask / Sinatra / Express.</p>]]></content><author><name></name></author><summary type="html"><![CDATA[I want to talk about what microframeworks don’t solve but also why there have been and continue to be so many of them. First, I guess we should identify some terms. Identifying if a web framework is a microframework is debatable but I would call it a microframework if: The author calls it a microframework It has a small hello world example that looks like Sinatra in the README or the homepage It doesn’t pre-solve a common problem for you So, Flask / Express / Sinatra are microframeworks as opposed to their sister projects Django / Next.js or others / Rails. If the language ecosystem offers a pair like Django vs Flask then comparatively, Flask is the microframework by comparison and not just because Wikipedia also classifies it this way. It’s because it’s smaller than Django. This is not obvious if a language does not have a bigger framework. The “pre-solve a common problem for you” is the big one. To make this more concrete, I will be using the use case of adding a database to your project as an inflection point. Larger frameworks usually have some kind of story around adding a database and the microframeworks do not. It Started with Sinatra There are many microframeworks in every language but it started with Sinatra. Sinatra had this very cute and small README snippet on their home page. “Put this in your pipe and smoke it” was the original tagline. The picture above is from The Wayback Machine. It was compelling and provocative at the time. You wire up a route and you get some plain text back. The whole idea fits in a code snippet and this was hard to ignore. require 'sinatra' get '/' do 'Hello world!' 
end The trade-offs of Sinatra are not obvious in the above Hello World. To me, the simplicity trade-off is enticing to juniors when they don’t know what the trade-off is. The API surface is small which is easier to learn. But confusing things start to happen after that. If you start a Sinatra project, you will realize that when you change your code, you have to ctrl-c to stop the dev server and start your app again with ruby myapp.rb. Every code change, switch terminal, ctrl-c, up-arrow, enter. That was just changing code, manual testing. Yes, there was a plugin to do dev reloading but so many other questions and concerns would appear after this most basic flow. It doesn’t come with dev reloading and common use cases continue from there: Where do I put passwords? How can I add a database? How do I build or deploy this thing? So, as the project grew or if you simply kept working with it you would have to solve, research or enhance Sinatra yourself. Many times, myself and others would copy whole files out of Rails default projects. Like, we’d generate a Rails project off to the side and steal .rb files from it. Or, we’d end up stealing ideas from Rails. Ideas like development server reloading, where to put configs, test fixtures or the concept of dev/test/prod. But I think these DX nit-picks are not where the test is. I think adding a database to a microframework is the inflection point where the lack of features makes things tedious and confusing. The Database is the Inflection Point I think configuring a database stresses the framework and most microframeworks fail here. This isn’t quite The Database Ruins All Good Ideas, it’s more like, The Database Makes the Framework Creak. Some teams running Flask might say “we have a database in Flask already, it’s easy” but what they really have is a bunch of hidden context. Take this example of how some work with Flask: There’s a production database which a Flask app connects to. 
This is the only database in the world for this application. There is no dev database, even on a laptop. The SQL to create this empty database was copied to something like Dropbox, email or a file share. Or maybe it never was. “Why would you need an empty database?” The database rarely changes but not by intent. When a change needs to happen, stress and confusion levels are high. It might even be deemed impossible. The password for the database connection is in git and the repo is set to private. This password is never rotated when team members leave. When you boot Flask to do development, Flask connects to the production database because that’s where the data is (“what else could we do?”). In some cases, the fact that this setup works at all is sometimes related to the circumstance that the application is a read-only visualization or dashboard. If the application had become read-write, this entire idea would fall apart. Or, maybe there’s a “database person” that essentially is a mutex for the team. Or, maybe the app is so small in scope that this is fine. So far, this hasn’t been my experience. My experience has been to take ideas as needed from full frameworks and bring them into Flask. These concepts are not obvious in Flask because Flask does not have these concepts in it. Take dev/test/prod. In order to add dev/test/prod, we need to: Get the passwords out of git and introduce the dotenv library Add some configuration files, maybe with Dynaconf Add database migrations and seeding Have every dev have a local database Have some example data or factories or something If we get this far, a pull request could have code and schema changes proposed. Without it, we’re unlikely to mess around with the database structure. You might be quick to blame the team. I’m quick to blame the tool but blame isn’t interesting here. I think Flask influences thinking and trades off too far in the direction of simplicity. 
There is no dev/test/prod in Flask so I have to invent it by composing libraries together on top of Flask. Actually getting this to work with tests and conftest.py is non-trivial. I’m not saying there is no trade-off either. There is absolutely a trade-off. Adding a dev database might confuse juniors. We might have to introduce Docker or try to solve development environments. I might have just invented my own flask-unchained by selecting libraries. For very simple applications, this simple setup might work fine. I don’t think it works for long. I think applications usually grow in complexity and features. I think it is common to have Flask fall apart in your hands. I think teams in this situation have only been exposed to Flask. My experience in these cases has been to introduce ideas from other frameworks to Flask-only teams. My experience is that even small apps outgrow Flask because most apps have state in the database and the default experience (not just Flask) is awful. Most of the Work is Explanation For dev/test/prod and Flask, the change in thinking had to be the following. There isn’t just one database. There are many instances of this database in different environments for different purposes. This is a flawed “shared development server” style of thinking where the only database available is the production one. It’s not A database, it’s THE database. ┌────────────────┐ │ "THE" Database │ │ │ └────────────────┘ ▲ │ │ ┌────────────────┐ │ Flask (laptop) │ │ │ └────────────────┘ Instead, thinking of it as many instances in different contexts or environments opens up many options. 
┌───────────────────┐ ┌───────────────────┐ ┌───────────────────┐ │ A Database (dev) │ │ A Database (test) │ │ A Database (prod) │ │ │ │ │ │ │ └───────────────────┘ └───────────────────┘ └───────────────────┘ ▲ ▲ ▲ │ │ │ │ │ │ ┌──────────────────┐ ┌──────────────────┐ ┌──────────────────┐ │ Flask (dev) │ │ Flask (test) │ │ Flask (prod) │ │ │ │ │ │ │ └──────────────────┘ └──────────────────┘ └──────────────────┘ There are many databases and not just one. There is no shared development server or shared state. There is no shared development anything. Test might be CI but test can also be your laptop. It is the dev -&gt; test -&gt; prod progression. Thinking of it this way solves problems like: how can we rotate passwords or get them out of git? how can I change the database structure while it is running? how can we scale from 1 developer to 5? But this is not what Flask comes with in the Hello World example nor does Flask teach you by example what this concept is. In order to get there you need to add plugins, libraries and configuration. Of course, full frameworks cannot solve all your problems (you have to code something). But when you need to diverge from the framework is usually much later in the project’s life and a good framework will be solving common problems for you (better than you could have done). It’s reuse. Trade Offs I think a lot of this is inevitable given an assumption of what class of application your application will turn out to be. Even with this viewpoint that I and others have, it was annoying when a greenfield would come along and we didn’t want to include all of Rails but we also knew that we’d regret picking Sinatra later because we’d have to at least handle things like passwords, ENVs, a database connection pool and development quality of life things. The trade-offs were known, which is why a microframework was being considered. But, do we need all of Rails, all the time? It’s annoying to have to have every concern included by default. 
Maybe we don’t have and won’t have an admin interface. Maybe we will never send email. But then we think through all the missing features and realize that we’d have to copy and steal ideas. This is also what happens in FastAPI to configure the database. Because FastAPI has “no database ideas” in it, you have to copy and paste configuration from documentation. Ideally, I’d want to opt-out of features in a full-framework and have it be configurable. Even with opt-out, the complexity is still there so I understand the draw to a small API surface area. Flask was inspired by Sinatra. It has the same trade-offs as Sinatra. If you try to enhance Flask to have more concepts in it then you end up with your own framework. Look at flask-unchained. The features that flask-unchained has are the features that Flask does not have. It comes with an opinion on databases, it has a structure for APIs etc. The issue with these projects is they are not authored by Flask Unchained. So the trade-off is a flexible plugin system counter open-source maintenance issues. Many Flask plugins are not maintained. The same issue happens in other languages. Full-featured frameworks usually control more of the stack. So, when they do a release, those things that they include or have written are bumped or fixed so they work in the new release. In theory, a plugin system sounds ideal because it has flexibility. What I’m arguing is that CORS, authentication and database state are extremely common and these things should be in the framework. More than that, copying and pasting configs from FastAPI docs is one-way and subject to bit-rot. Controlling configuration even in a full framework is extremely challenging but usually there can be step-to-step upgrade guides but this only works when you can name the version you are on. If you are copying and pasting configuration and code, what version of FastAPI are you on? A Tour of Small READMEs There are many microframeworks in many languages. 
Most of them have a small hello world README just like Sinatra did. Echo in Go package main import ( "github.com/labstack/echo/v4" "github.com/labstack/echo/v4/middleware" "net/http" ) func main() { // Echo instance e := echo.New() // Middleware e.Use(middleware.Logger()) e.Use(middleware.Recover()) // Routes e.GET("/", hello) // Start server e.Logger.Fatal(e.Start(":1323")) } // Handler func hello(c echo.Context) error { return c.String(http.StatusOK, "Hello, World!") } Kemal in Crystal require "kemal" # Matches GET "http://host:port/" get "/" do "Hello World!" end # Creates a WebSocket handler. # Matches "ws://host:port/socket" ws "/socket" do |socket| socket.send "Hello from Kemal!" end Kemal.run Express in Node const express = require('express') const app = express() const port = 3000 app.get('/', (req, res) =&gt; { res.send('Hello World!') }) app.listen(port, () =&gt; { console.log(`Example app listening on port ${port}`) }) These examples are easy to understand, which is a pro in the complexity trade-off. What they do not show is the other side of the trade-off. There are Few Fully-Featured Frameworks There are very few fully-featured frameworks and there are usually only one or two in each language. Python has Django JavaScript has Next or Remix or RedwoodJS Java has Spring Boot PHP has Laravel Ruby has Rails C# has .NET Elixir has Phoenix But there are very few others. It seems that language communities tend to rally around one or two full frameworks and these frameworks last a long time. The microframeworks might churn a lot more because there is less to throw away or because they are easy to invent? Flask interest turns to FastAPI. Sinatra interest turns to Grape. Express interest turns to Fastify. But Django interest rarely turns into anything except a newer version of Django. Masonite is a rare exception (I have not tried it at scale). It’s much easier to create a hobby microframework project that is small in scope. 
I think this is because full frameworks take 7 years to do right and this is assuming a productive or high-level language. The amount of work it takes to make a framework that is documented, tested and feature-rich is huge. Usually it seems to require a sponsored company or extraction out of a working business. Django was extracted out of a newspaper company. Rails was extracted out of a team management company. Next was extracted or built in parallel out of a hosting company. RedwoodJS is from an exited GitHub founder. I think it is interesting that this is the scale that is required but it also might help us predict what is possible. Can a hobbyist disrupt a full framework? Can a single person invent a Django or Rails killer? We can almost skip paying attention to microframeworks because they will be the first to go. I am not trying to FUD people out of hobby projects. Do it, make your thing. But unless it has some very novel idea in it, it’s unlikely to stick. We’ve had hundreds of microframeworks already. So if the real reason for invention is “I did it because I could” then this isn’t good enough. We need more “what we need” even if it takes years. This is my main point. Full-featured frameworks are harder to make and harder to learn. But they have ideas in them that you should probably know about. 
Critically think about the Hello World README example in Flask / Sinatra / Express.]]></summary></entry><entry><title type="html">Ruby Has Long Known About Poetry</title><link href="https://squarism.com/2023/10/24/ruby-has-long-known-about-poetry/" rel="alternate" type="text/html" title="Ruby Has Long Known About Poetry" /><published>2023-10-24T00:00:00-07:00</published><updated>2023-10-24T00:00:00-07:00</updated><id>https://squarism.com/2023/10/24/ruby-has-long-known-about-poetry</id><content type="html" xml:base="https://squarism.com/2023/10/24/ruby-has-long-known-about-poetry/"><![CDATA[<p><img alt="Bundler and Poetry timeline" style="width: 100%; margin: auto;" src="/uploads/2023/bundle_poetry.png" /></p>

<p>Ruby has known about the Python package manager <a href="https://python-poetry.org/">Poetry</a> for a long time as <a href="https://bundler.io/">Bundler</a>.  This is not a smug victory lap.  Bundler was not always <em>the thing to use</em>.  If the sands of time are going to bury Ruby so that the language is <a href="https://sloboda-studio.com/blog/is-ruby-on-rails-dying/">deleted</a> from all multiverse timelines, then we might as well listen to its dying words:</p>

<p>    <em>“The Poetry transition is the Bundler transition, we already went through this.  Ack.”</em> 💀</p>

<p>Bundler is the de facto standard for installing a library in a Ruby project and has been for quite a while.  But there was a time before Bundler existed, when we had library needs but we did not have Bundler.  In that pre-Bundler time, some of us would use something like RVM (Ruby Version Manager) gemsets.  There were other tools but I am going to talk about RVM.  I will explain it later in this post in case you don’t know RVM.</p>

<p>Currently, as of writing, there is <a href="https://blog.viraptor.info/post/python-dependency-management-difficulty-is-an-unhelpful-meme">much debate</a> about Python packaging.  We are in the throes of AI hype.  Python is being learned, is in use and is popular.  But those that are learning it are learning <code class="language-plaintext highlighter-rouge">pip install</code>.  Or AI projects are <a href="https://ml-ops.org/">not yet feeling operational concerns</a> like repeatable builds.</p>

<p>I’m aware of prior-art posts:</p>

<ul>
  <li><a href="https://www.bitecode.dev/p/why-not-tell-people-to-simply-use">Why not tell people to “simply” use pyenv, poetry or anaconda</a></li>
  <li><a href="https://blog.viraptor.info/post/python-dependency-management-difficulty-is-an-unhelpful-meme">Python dependency management difficulty is an unhelpful meme</a></li>
</ul>

<p>These are good posts.  I’m adding orthogonally to them.  I want to talk about other prior art: Bundler and RVM gemsets.</p>

<h2 id="translation">Translation</h2>

<p>First, let’s translate a little bit.  Ruby gems are like Python packages.  Python has <code class="language-plaintext highlighter-rouge">.whl</code> and Ruby has <code class="language-plaintext highlighter-rouge">.gem</code>.  Ruby has only had gems but Python has had many package formats over the years, so I will just say Python package.</p>

<p>You can install packages yourself, sort of “raw” using <code class="language-plaintext highlighter-rouge">pip install</code> in Python and <code class="language-plaintext highlighter-rouge">gem install</code> in Ruby.  Some people do this in Python but rarely do you do this in Ruby.  There are exceptions like global installs or generators but basically you rarely would do the equivalent of <code class="language-plaintext highlighter-rouge">pip install</code> in Ruby.</p>

<p>Both languages have a package website.  Python’s is <a href="https://pypi.org/">pypi.org</a> and Ruby’s is <a href="https://rubygems.org/">rubygems.org</a>.</p>

<table>
  <thead>
    <tr>
      <th>Python Terminology</th>
      <th>Ruby Terminology</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>python package</td>
      <td>ruby gem</td>
    </tr>
    <tr>
      <td>pip install</td>
      <td>gem install</td>
    </tr>
    <tr>
      <td>pypi.org</td>
      <td>rubygems.org</td>
    </tr>
  </tbody>
</table>

<h2 id="how-rvm-gemsets-worked">How RVM Gemsets Worked</h2>

<p>Before Bundler was even invented, I was using RVM gemsets to keep my Ruby projects separated.  RVM (Ruby Version Manager) would install and manage Ruby versions.  But it also had this extra feature called <a href="https://rvm.io/gemsets">gemsets</a>.  With gemsets, you could create a set of packages named after your project.  You could even have RVM switch to the gemset when you <code class="language-plaintext highlighter-rouge">cd</code>’d into your project directory.  So this would be like having your <code class="language-plaintext highlighter-rouge">venv</code> activate automatically.</p>

<table>
  <thead>
    <tr>
      <th>Python Terminology</th>
      <th>Ruby Terminology</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>python -m venv venv</td>
      <td>rvm gemset create my-project</td>
    </tr>
    <tr>
      <td>. venv/bin/activate</td>
      <td>rvm gemset use my-project</td>
    </tr>
  </tbody>
</table>

<p>You would use RVM gemsets because if you didn’t, your Ruby install would fill up with packages, even the same one many times with different versions.  Then your project would not know which to use, not to mention you wouldn’t know which packages you were using.  So, if someone asked you this question:</p>

<blockquote>
  <p>Hey, are we using any GPLv2 stuff?</p>
</blockquote>

<p>You’d have no idea.  The same happens with raw <code class="language-plaintext highlighter-rouge">pip</code>.</p>
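<p>One way to start answering the license question from inside a Python environment is to read the metadata that installed packages declare about themselves.  This is a minimal sketch using the standard library’s <code class="language-plaintext highlighter-rouge">importlib.metadata</code>.  The <code class="language-plaintext highlighter-rouge">license_report</code> helper is a hypothetical name, and the declared license field is often missing or free-form, so treat the output as a starting point and not as an audit.</p>

```python
from importlib.metadata import distributions


def license_report():
    """Return sorted (name, declared license) pairs for installed packages."""
    report = []
    for dist in distributions():
        meta = dist.metadata
        if meta is None:
            continue  # broken install with no metadata file
        name = meta.get("Name") or "unknown"
        # Newer packages may set License-Expression; older ones use License.
        declared = meta.get("License-Expression") or meta.get("License") or ""
        lines = declared.strip().splitlines()
        # Some packages paste the whole license text; keep only the first line.
        report.append((name, lines[0] if lines else "unknown"))
    return sorted(report, key=lambda pair: pair[0].lower())


if __name__ == "__main__":
    for name, declared in license_report():
        print(f"{name}: {declared}")
```

<p>In a polluted environment this lists everything that was ever installed, not just what the project needs, which is the problem described above.</p>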

<p>When Bundler came out, <a href="https://wiki.c2.com/?BlubParadox">I rejected it</a> and kept using RVM gemsets.  But there was a major difference.  Bundler solved the dependency tree while doing isolation.  This was something I didn’t understand.  When someone told me that my gemset was just going to become polluted, I said this was no big deal.  I would just delete my gemset and then <code class="language-plaintext highlighter-rouge">gem install</code> all the libraries I needed.  Maybe I could make a list of gems I needed in a text file in the project root.</p>

<p>I never tried Bundler and I didn’t know what problem it was solving.</p>

<h2 id="when-i-changed-my-mind">When I Changed My Mind</h2>

<p>So, I had my own devised system, mostly out of habit.  The workflow was very similar to pip with <code class="language-plaintext highlighter-rouge">requirements.txt</code>.  Except back then, we used <code class="language-plaintext highlighter-rouge">README.md</code> as a list of dependencies.  It was pretty terrible for repeatability but also just developer experience in general.</p>

<p>This is how classic pip with <code class="language-plaintext highlighter-rouge">requirements.txt</code> would work too.  <code class="language-plaintext highlighter-rouge">pip install -r requirements.txt</code> is append-only to your environment.  There’s no cleanup function.  <code class="language-plaintext highlighter-rouge">pip list</code> shows you all your downstream dependencies but <code class="language-plaintext highlighter-rouge">requirements.txt</code> only shows you what you want.  If you remove a top-level dependency, it’s hard to clean up the transitive dependencies.  So, people delete their virtualenvs just like I did with gemsets.</p>
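<p>The gap between what you asked for and what is installed can be sketched as a graph walk.  This is a rough illustration and not how pip or Poetry work internally.  The helper names (<code class="language-plaintext highlighter-rouge">installed_graph</code>, <code class="language-plaintext highlighter-rouge">reachable</code>, <code class="language-plaintext highlighter-rouge">orphans</code>) are hypothetical, the example graph is made up, and as a simplifying assumption the sketch skips optional extras and ignores every other environment marker.</p>

```python
import re
from importlib.metadata import distributions


def normalize(name):
    """Normalize a project name the way PEP 503 describes."""
    return re.sub(r"[-_.]+", "-", name).lower()


def requirement_name(requirement):
    """Pull the project name off the front of a string like 'jinja2>=3.0'."""
    match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", requirement.strip())
    return normalize(match.group(0)) if match else None


def installed_graph():
    """Map each installed package to the set of packages it requires."""
    graph = {}
    for dist in distributions():
        name = dist.metadata.get("Name")
        if not name:
            continue
        deps = set()
        for requirement in dist.requires or []:
            if re.search(r"extra\s*==", requirement):
                continue  # assumption: optional extras are not installed
            dep = requirement_name(requirement)
            if dep:
                deps.add(dep)
        graph[normalize(name)] = deps
    return graph


def reachable(graph, top_level):
    """Return every package reachable from the top-level names."""
    seen = set()
    stack = [normalize(name) for name in top_level]
    while stack:
        name = stack.pop()
        if name in seen or name not in graph:
            continue
        seen.add(name)
        stack.extend(graph[name])
    return seen


def orphans(graph, top_level):
    """Installed packages that nothing in top_level depends on anymore."""
    return sorted(set(graph) - reachable(graph, top_level))


if __name__ == "__main__":
    # Made-up environment: requests was removed from requirements.txt
    # but it and its dependency are still installed.
    example = {
        "flask": {"werkzeug", "jinja2"},
        "werkzeug": set(),
        "jinja2": {"markupsafe"},
        "markupsafe": set(),
        "requests": {"urllib3"},
        "urllib3": set(),
    }
    print(orphans(example, ["Flask"]))
```

<p>Against a real virtualenv you would call <code class="language-plaintext highlighter-rouge">orphans(installed_graph(), names)</code> with the names from your own <code class="language-plaintext highlighter-rouge">requirements.txt</code>.  Having to write this yourself is the point: a tool with a lock file already tracks it.</p>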

<p>Bundler came on the scene and I rejected it.  “I have gemsets, why do I need Bundler?”  But Bundler was doing something more than just project separation.  Bundler was giving me higher level commands, semver and a lockfile I did not have to serialize to a README or a text file.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># usage of bundler
bundle init
bundle add some-library
git add Gemfile Gemfile.lock

# a user of my code
bundle install
</code></pre></div></div>

<p>This is compared to</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>rvm gemset create my-project
gem install some-library
# update README.md with some-library "hey, this project needs some-library"

# a user of my code
gem install some-library
</code></pre></div></div>

<p>The only caveat to Bundler is that I need to prefix all commands with <code class="language-plaintext highlighter-rouge">bundle exec</code> because Bundler does not use shell tricks to change ENVs or paths.  So if I want to see all the gems I have listed: <code class="language-plaintext highlighter-rouge">bundle exec gem list</code>.  I mitigate this by using an alias for <code class="language-plaintext highlighter-rouge">be=bundle exec</code>.</p>

<h2 id="the-equivalent-with-poetry">The Equivalent with Poetry</h2>

<p>The number of commands with Bundler and Poetry is about the same.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>poetry init -n
poetry add some-library
git add pyproject.toml poetry.lock

# a user of my code
poetry install
</code></pre></div></div>

<p>The only caveat with Poetry is that I need to prefix all commands with <code class="language-plaintext highlighter-rouge">poetry run</code>, for example <code class="language-plaintext highlighter-rouge">poetry run pytest</code>.  I mitigate this by using an alias for <code class="language-plaintext highlighter-rouge">pr=poetry run</code>.</p>

<h2 id="ruby-has-known-about-poetry">Ruby Has Known About Poetry</h2>

<p>So, Ruby already went through the Poetry transition.  We learned a few things:</p>

<ul>
  <li>Don’t raw install libraries with <code class="language-plaintext highlighter-rouge">gem install</code>.</li>
  <li>Deeply solve your dependencies as a tree (transitive deps).</li>
  <li>Prefixes on commands are annoying but sub-shell or shell tricks are worse.</li>
  <li>Lock files are good.</li>
  <li>Conventions are good.</li>
</ul>

<p>This would translate to Python as something like this:</p>

<ul>
  <li>Don’t raw install libraries with <code class="language-plaintext highlighter-rouge">pip install</code>.</li>
  <li>Deeply solve your tree with pip 23.1+.  But, is it a good resolver?  🤷🏻‍♂️</li>
  <li>Get used to <code class="language-plaintext highlighter-rouge">poetry run</code>, alias it if you have to.</li>
  <li>Lock files are good, Poetry comes with one.  You don’t need to use <code class="language-plaintext highlighter-rouge">pip freeze</code> or pip-tools or addons.</li>
  <li>Conventions are good.  The world will not use your homegrown system.</li>
</ul>
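<p>To make the difference between a version constraint and a lock file concrete, here is a small sketch of a caret constraint as Poetry documents it, where <code class="language-plaintext highlighter-rouge">^1.2.3</code> allows anything from 1.2.3 up to but not including 2.0.0.  The function names and the list of available versions are made up for illustration, and the sketch assumes plain three-part numeric versions.  It is not Poetry’s resolver.</p>

```python
def parse(version):
    """Turn '1.4.2' into (1, 4, 2). Assumes numeric MAJOR.MINOR.PATCH."""
    parts = [int(part) for part in version.split(".")]
    return tuple(parts + [0] * (3 - len(parts)))


def caret_upper_bound(base):
    """First version that a caret constraint on base no longer allows."""
    major, minor, patch = base
    if major > 0:
        return (major + 1, 0, 0)
    if minor > 0:
        return (0, minor + 1, 0)
    return (0, 0, patch + 1)


def satisfies_caret(candidate, constraint):
    """True if candidate falls inside a constraint such as '^1.2.0'."""
    base = parse(constraint.lstrip("^"))
    version = parse(candidate)
    return version >= base and caret_upper_bound(base) > version


def lock(constraint, available):
    """Pick the newest allowed version, the way a lock file pins one."""
    matching = [v for v in available if satisfies_caret(v, constraint)]
    return max(matching, key=parse) if matching else None


if __name__ == "__main__":
    # Hypothetical versions published for some-library.
    available = ["1.1.9", "1.2.0", "1.4.2", "2.0.0"]
    print(lock("^1.2.0", available))
```

<p>The constraint is a range, so two installs on different days can pick different versions.  The lock file records the one version that was picked, which is what makes the build repeatable.</p>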

<p>After I started using Bundler, I never went back to any other style.  When I tried Go for 4 years, I used <code class="language-plaintext highlighter-rouge">gb</code> and other tools until <code class="language-plaintext highlighter-rouge">go.mod</code> was finalized.  It was similar with Python.  I searched for a Bundler-like tool and found Pipenv.  Pipenv’s resolver failed me on a project and Poetry did not.  I switched to Poetry.  When I started with Rust, Cargo was very familiar because <a href="https://github.com/rust-lang/cargo/commits?author=wycats">the people</a> that worked on Cargo <a href="https://github.com/rust-lang/cargo/commits?author=carllerche">came from</a> the Ruby community.</p>

<p>Lately, pip has been changing over to a <a href="https://pip.pypa.io/en/stable/reference/build-system/pyproject-toml/">pyproject.toml</a> format which has a higher level of abstraction.  I’m glad.  It seems like pip is becoming more like Poetry.  That’s fine, let the ideas be shared.  I would use a Bundler-like tool, no matter the name.  You could even say I would use a Cargo-like tool.  The trickiest part of Poetry has been convincing people to try it and I went through the same thing with gemsets.</p>]]></content><author><name></name></author><summary type="html"><![CDATA[Ruby has known about the Python package manager Poetry for a long time as Bundler. This is not a smug victory lap. Bundler was not always the thing to use. If the sands of time are going to bury Ruby so that the language is deleted from all multiverse timelines, then we might as well listen to its dying words:    “The Poetry transition is the Bundler transition, we already went through this. Ack.” 💀 Bundler is the de facto standard for installing a library in a Ruby project and has been for quite a while. But there was a time before Bundler existed, when we had library needs but we did not have Bundler. In that pre-Bundler time, some of us would use something like RVM (Ruby Version Manager) gemsets. There were other tools but I am going to talk about RVM. I will explain it later in this post in case you don’t know RVM. Currently, as of writing, there is much debate about Python packaging. We are in the throes of AI hype. Python is being learned, is in use and is popular. But those that are learning it are learning pip install. Or AI projects are not yet feeling operational concerns like repeatable builds. I’m aware of prior-art posts: Why not tell people to “simply” use pyenv, poetry or anaconda Python dependency management difficulty is an unhelpful meme These are good posts. I’m adding orthogonally to them. I want to talk about other prior art: Bundler and RVM gemsets. 
Translation First, let’s translate a little bit. Ruby gems are like Python packages. Python has .whl and Ruby has .gem. Ruby has only had gems but Python has had many package formats over the years, so I will just say Python package. You can install packages yourself, sort of “raw” using pip install in Python and gem install in Ruby. Some people do this in Python but rarely do you do this in Ruby. There are exceptions like global installs or generators but basically you rarely would do the equivalent of pip install in Ruby. Both languages have a package website. Python’s is pypi.org and Ruby’s is rubygems.org. Python Terminology Ruby Terminology python package ruby gem pip install gem install pypi.org rubygems.org How RVM Gemsets Worked Before Bundler was even invented, I was using RVM gemsets to keep my Ruby projects separated. RVM (Ruby Version Manager) would install and manage Ruby versions. But it also had this extra feature called gemsets. With gemsets, you could create a set of packages named after your project. You could even have RVM switch to the gemset when you cd’d into your project directory. So this would be like having your venv activate automatically. Python Terminology Ruby Terminology python -m venv venv rvm gemset create my-project . venv/bin/activate rvm gemset use my-project You would use RVM gemsets because if you didn’t, your Ruby install would fill up with packages, even the same one many times with different versions. Then your project would not know which to use, not to mention you wouldn’t know which packages you were using. So, if someone asked you this question: Hey, are we using any GPLv2 stuff? You’d have no idea. The same happens with raw pip. When Bundler came out, I rejected it and kept using RVM gemsets. But there was a major difference. Bundler solved the dependency tree while doing isolation. This was something I didn’t understand. 
When someone told me that my gemset was just going become polluted, I said this was no big deal. I would just delete my gemset and then gem install all the libraries I needed. Maybe I could make a list of gems I needed in a text file in the project root. I never tried Bundler and I didn’t know what problem it was solving. When I Changed My Mind So, I had my own devised system, mostly out of habit. The workflow was very similar to pip with requirements.txt. Except back then, we used README.md as a list of dependencies. It was pretty terrible for repeatability but also just developer experience in general. This is how classic pip with requirements.txt would work too. pip install -r requirements.txt is append-only to your environment. There’s no cleanup function. pip list shows you all your downstream dependencies but requirements.txt only shows you what you want. If you remove a top-level dependency, it’s hard to cleanup the transitive dependencies. So, people delete their virtualenvs just like I did with gemsets. Bundler came on the scene and I rejected it. “I have gemsets, why do I need Bundler?” But Bundler was doing something more than just project separation. Bundler was giving me higher level commands, semver and a lockfile I did not have to serialize to a README or a text file. # usage of bundler bundle init bundle add some-library git add Gemfile Gemfile.lock # a user of my code bundle install This is compared to rvm gemset create my-project gem install some-library # update README.md with some-library "hey, this project needs some-library" # a user of my code gem install some-library The only caveat to bundler is that I need to prefix all commands with bundle exec because bundler did not use shell tricks to change ENVs or paths. So if I wanted to see all the gems I have listed: bundle exec gem list. I mitigate this by using an alias for be=bundle exec. The Equivalent with Poetry The amount of commands with Bundler and Poetry is about the same. 
poetry init -n poetry add some-library git add pyproject.toml poetry.lock # a user of my code poetry install The only caveat with poetry is that I need to prefix all commands with poetry run so poetry run pytest. I mitigate this by using an alias for pr=poetry run. Ruby Has Known About Poetry So, Ruby already went through the poetry transition. We learned a few things: Don’t raw install libraries with gem install. Deeply solve your dependencies as a tree (transitive deps). Prefixes on commands are annoying but sub-shell or shell tricks are worse. Lock files are good. Conventions are good. This would translate to Python as something like this: Don’t raw install libraries with pip install. Deeply solve your tree with pip 23.1+. But, is it a good resolver? 🤷🏻‍♂️ Get used to poetry run, alias it if you have to. Lock files are good, poetry comes with one. You don’t need to use pip freeze or piptools or addons. Conventions are good. The world will not use your homegrown system. After I started using Bundler, I never went back to even another style. When I tried Go for 4 years, I used gb and other tools until go.mod was finalized. It was similar with Python. I searched for a Bundler-like tool and found Pipenv. Pipenv’s resolver failed me on a project and Poetry did not. I switched to Poetry. When I started with Rust, Cargo was very familiar because the people that worked on Cargo came from the Ruby community. Lately, pip has been changing over to a pyproject.toml format which has a higher level of abstraction. I’m glad. It seems like pip is becoming more like Poetry. That’s fine, let the ideas be shared. I would use a Bundler-like tool, no matter the name. You could even say I would use a Cargo-like tool. 
The trickiest part of Poetry has been convincing people to try it and I went through the same thing with gemsets.]]></summary></entry><entry><title type="html">How To Test Random Things</title><link href="https://squarism.com/2023/08/28/how-to-test-random-things/" rel="alternate" type="text/html" title="How To Test Random Things" /><published>2023-08-28T00:00:00-07:00</published><updated>2023-08-28T00:00:00-07:00</updated><id>https://squarism.com/2023/08/28/how-to-test-random-things</id><content type="html" xml:base="https://squarism.com/2023/08/28/how-to-test-random-things/"><![CDATA[<p>Let’s say we had a program that interacts with something that is random.  This could be a pseudorandom number generator (PRNG) we can sort of control or it could be something out of our control.  Let’s step through three examples, where the last one is random but wildly different.</p>

<p>Our program depends on a function that returns something random.  How do we know our program works?</p>

<h2 id="mock-the-randomness">Mock the Randomness</h2>

<p>Our program is just consuming the random function.  If we print out the random number, we just need a number.  We don’t care if the number is random or not.  So the answer is simple: just avoid the randomness of the function.</p>

<p>It’s <a href="https://xkcd.com/221/">the XKCD comic 221</a>.</p>

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">int</span> <span class="nf">getRandomNumber</span><span class="o">()</span>
<span class="o">{</span>
  <span class="k">return</span> <span class="mi">4</span><span class="o">;</span>  <span class="c1">// chosen by fair dice roll.</span>
             <span class="c1">// guaranteed to be random.</span>
<span class="o">}</span>
</code></pre></div></div>

<p>So, we could basically do the same by mocking out the random function to return <code class="language-plaintext highlighter-rouge">4</code>.  Problem avoided.</p>
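<p>Here is a minimal sketch of that idea in Python, using the standard library’s <code class="language-plaintext highlighter-rouge">unittest.mock</code>.  The <code class="language-plaintext highlighter-rouge">describe_roll</code> function is a hypothetical stand-in for our program; it is not from any real project.</p>

```python
import random
from unittest import mock


def describe_roll():
    # Hypothetical program code: it only consumes the random function.
    return "You rolled a {}".format(random.randint(1, 6))


def test_describe_roll():
    # Replace random.randint with a stub that always returns 4.
    with mock.patch("random.randint", return_value=4):
        assert describe_roll() == "You rolled a 4"


test_describe_roll()
```

<p>The test never sees a real random number, so it passes or fails the same way on every run.</p>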

<h2 id="leverage-the-seed">Leverage the Seed</h2>

<p>If we need to functionally test what happens in different cases, maybe mocking the random function isn’t good enough.  What if the number <code class="language-plaintext highlighter-rouge">9</code> crashes our program for some reason?  We want <code class="language-plaintext highlighter-rouge">9</code> to be returned by the random number generator, but in a more complicated program (say a map generator) we might also want to reproduce the exact state that led to <code class="language-plaintext highlighter-rouge">9</code>.</p>

<p>This is pretty easy to deal with: we just pass <a href="https://en.wikipedia.org/wiki/Random_seed">a seed</a> in.  Many things use this technique:</p>

<ul>
  <li>A test suite’s order in <a href="https://pypi.org/project/pytest-random-order/">pytest-random-order</a> and <a href="https://rubydoc.info/gems/rspec-core/RSpec%2FCore%2FConfiguration:seed">rspec</a>.</li>
  <li>The Minecraft <code class="language-plaintext highlighter-rouge">/seed</code> command so your friends can play the map you have.</li>
  <li>An avatar generator might have a seed as well as other inputs.</li>
</ul>

<p>Using the seed, you can replay or inspect the randomness as a way to make the behavior deterministic.</p>
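<p>As a rough sketch, assuming a toy map generator (the <code class="language-plaintext highlighter-rouge">generate_map</code> function and its tile format are made up for illustration), seeding in Python could look like this:</p>

```python
import random


def generate_map(seed, size=5):
    # Hypothetical map generator: each tile is a digit from 0 to 9.
    rng = random.Random(seed)  # private generator, isolated from global state
    return [rng.randint(0, 9) for _ in range(size)]


# The same seed replays the same "random" map every time.
first = generate_map(seed=1234)
replay = generate_map(seed=1234)
assert first == replay
```

<p>If a particular seed produces the map that crashes the program, that seed can go straight into a regression test.</p>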

<p>Not everything can be made deterministic in a sort of clean room setting.  The next example is very different.</p>

<h2 id="ai-models">AI Models</h2>

<p>Let’s take a sharp turn and assume that the context you are in is AI, Machine Learning (ML) or LLMs.  Suddenly we are encountering a very different kind of randomness.  These models are not deterministic and they don’t have a seed.  We can test our program around the randomness with mocking, perhaps.  But what about the model itself?  What if we need to know that <em>it</em> works?</p>

<p>So, what are we talking about here?  How do we know our model works?  How do you test a model?  How do you compare two models?  How do you figure out if you have improved your model when you release a new version?  How do we know if ChatGPT 4 is better than 3.5 beyond anecdotes?  How does OpenAI determine that ChatGPT 4 is so good that it should be a paid upgrade over 3.5?</p>

<p>The answer is evaluation, but first I want to break down what an evaluation is.  I have to caveat that my experience with ML evaluations is mostly surface level from proximity to research, PhDs, etc.  I have never designed an evaluation, so my numbers here might be very rough.  I’ll also caveat that my specific examples (like Llama 2) are locked in time to the publish date of this post.</p>

<p>Let’s say that we wanted to see if <a href="https://ai.meta.com/llama/">Llama 2</a> is worse at generating Rust code than <a href="https://about.fb.com/news/2023/08/code-llama-ai-for-coding/">Llama 2 Code</a> is.  Seems reasonable, right?  Llama Code was made for coding and Llama was not.  So, how do we really know?  The models are not deterministic.  We can ask a model the same question and we will get different answers.  We can’t mock anything because we don’t want to test the code <em>around</em> the model, we want to test the model.  There is no seed or anything to make it deterministic.  And, we can’t inspect the model because <a href="https://medium.com/gsi-technology/making-large-language-models-more-explainable-b8215696a659">these models are not</a> really <a href="https://en.wikipedia.org/wiki/Explainable_artificial_intelligence">explainable</a> (or at least not easily).</p>

<p>So, our previous approaches don’t work.  We need a new approach.  The new approach will be to throw experiments at it and measure our results using some methodology.  There are many methodologies but for the sake of brevity, we’ll use the classic <a href="https://en.wikipedia.org/wiki/Precision_and_recall">F1 score</a>.  I’m not going to go into how that works, only how the evaluation would roughly be run.</p>

<p>The rough steps are going to be:</p>

<ol>
  <li>Generate or obtain ground truth questions and answers and call it the “Test Data”.</li>
  <li>Keep the answers secret from the model.</li>
  <li>Give the Llama 2 model each question from the Test Data and note its answer.</li>
  <li>Do the same for the Llama 2 Code model.</li>
  <li>Score the results.</li>
</ol>

<p>These steps are easier said than done.  Let’s try to break this down even more.  What is involved in step 1?</p>

<h3 id="step-1---generate-rust-questions">Step 1 - Generate Rust Questions</h3>

<p>One question will be this:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Generate hello world in Rust.
</code></pre></div></div>

<p>And the answer we expect is this:</p>
<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">fn</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
    <span class="nd">println!</span><span class="p">(</span><span class="s">"Hello, World!"</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

<p>We (as humans) all agree that this is the result that we expect given the question/prompt.  No more context is allowed.  This is one question and answer pair in our “Test Data” set.  We might aim for 200 or 1,000 such questions.  We might aim for a diverse set of questions, things way beyond hello world.  As we select what we care about, we add bias to our evaluation.  But bias is a different topic.</p>

<p>Step 2 is simply file organization, really.  This is more important if we were fine-tuning Llama or making a model ourselves.  But since Llama 2 models are already done, we can just move on.</p>

<p>As a side note, generating ground truth and test data can sometimes take a team of many people many months.</p>

<h3 id="step-3-and-4---runs">Step 3 and 4 - Runs</h3>

<p>For each question, run the model with the question.  If we have Llama running somewhere, hit the API and record the result.  We’ll get to the hard problem in Step 5.  Repeat the calls for the other model.  Organize the results.  Maybe you have something like this:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>results/llama_2/0001.json
results/llama_2/0002.json
...
results/llama_2_code/0001.json
results/llama_2_code/0002.json
...
</code></pre></div></div>

<p>The results files might be API JSON responses or a file format we’ve invented with metadata and the response as an attribute.  The results file format is not important.</p>
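<p>A rough sketch of the run loop might look like the following.  The <code class="language-plaintext highlighter-rouge">ask_model</code> function is a placeholder I made up; it returns a canned answer where a real version would call whatever API the model is served behind.  The JSON fields are also just an assumption.</p>

```python
import json
import tempfile
from pathlib import Path


def ask_model(model_name, question):
    # Placeholder (assumption): a real version would call the model's API.
    return 'fn main() {\n    println!("Hello, World!");\n}\n'


def run_model(model_name, questions, results_dir):
    # Write one numbered JSON file per question, e.g. results/llama_2/0001.json
    out_dir = Path(results_dir) / model_name
    out_dir.mkdir(parents=True, exist_ok=True)
    for number, question in enumerate(questions, start=1):
        record = {"question": question, "response": ask_model(model_name, question)}
        path = out_dir / "{:04d}.json".format(number)
        path.write_text(json.dumps(record, indent=2))
    return out_dir


questions = ["Generate hello world in Rust."]
results_root = tempfile.mkdtemp()  # stand-in for the results/ directory
for model in ["llama_2", "llama_2_code"]:
    run_model(model, questions, results_root)
```

<p>Keeping the question next to the response in each file makes the scoring step easier to audit later.</p>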

<h3 id="step-5---scoring">Step 5 - Scoring</h3>

<p>Now the tricky part, scoring.  What we need is a very exact number for our F1 calculation.  But we have a text answer coming back.  Consider this answer for our <em>hello world</em> question:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">fn</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
    <span class="nd">println!</span><span class="p">(</span><span class="s">"Hello, World!"</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

<p>It’s an exact match.  So, let’s talk about how this is scored.  F1 score is calculated from 3 metrics around Precision and Recall:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>True Positives (TP): 1 (since the generated code is correct)
False Positives (FP): 0 (since there are no incorrect results generated)
False Negatives (FN): 0 (since there are no correct results missed)

Precision = TP / (TP + FP)
Recall = TP / (TP + FN)
</code></pre></div></div>
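<p>F1 itself is the harmonic mean of Precision and Recall.  A small sketch of the arithmetic, where treating a zero denominator as a score of zero is my own simplifying assumption:</p>

```python
def f1_score(tp, fp, fn):
    # Treat a zero denominator as a score of 0.0 (an assumption of this sketch).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


print(f1_score(tp=1, fp=0, fn=0))  # prints 1.0
```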

<p>Precision is sort of like a function of quality and Recall is a function of quantity.  They both have to be considered at the same time.  In this case (one question, one answer) our F1 score is <code class="language-plaintext highlighter-rouge">1.0</code>.  It doesn’t get any better than that.  But our scorer is way too simple.  Consider this answer:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// this program prints hello world</span>
<span class="k">fn</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
    <span class="nd">println!</span><span class="p">(</span><span class="s">"Hello, World!"</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

<p>We said that the generated code must match the answer.  It doesn’t match the answer, even if we (as humans) know that code comments are ok.  A tool like <code class="language-plaintext highlighter-rouge">grep</code> would fail to find a complete string match against our Test Data answers:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>True Positives (TP): 0 (since the generated code is incorrect)
False Positives (FP): 1 (since one incorrect result was generated)
False Negatives (FN): 0 (since there are no correct results missed)
</code></pre></div></div>

<p>What is our score now?  It’s <code class="language-plaintext highlighter-rouge">0.0</code>.  But we (as humans) know that comments are ok, even encouraged in some cases.  So, how could we get around this?  We could pre-process our results to trim code comments to handle this specific case.  What else could we do?  Well, things get trickier after this point.  Our scorer would have to improve or we’d have to use random sampling (as human domain experts) or other techniques to score our results.  If we can make our scorer smarter then there is less and less manual work to do.  Maybe we could use <a href="https://en.wikipedia.org/wiki/Levenshtein_distance">Levenshtein distance</a> to fuzzy match bits of the program.  Maybe we could break it apart with a parser.  Maybe we could just run it and capture the results.  Otherwise, lots of manual work.</p>
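<p>The comment-trimming idea could be sketched like this.  It is deliberately naive: it only drops lines that are nothing but a <code class="language-plaintext highlighter-rouge">//</code> comment, and it assumes the answers are plain Rust source strings.</p>

```python
def strip_line_comments(source):
    # Drop lines that are only a // comment, plus blank lines and trailing spaces.
    # Simplification: // inside string literals or after code is not handled.
    kept = []
    for line in source.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("//"):
            kept.append(line.rstrip())
    return "\n".join(kept)


expected = 'fn main() {\n    println!("Hello, World!");\n}\n'
answer = '// this program prints hello world\n' + expected

assert answer != expected  # a raw string comparison fails
assert strip_line_comments(answer) == strip_line_comments(expected)
```

<p>Both sides go through the same pre-processor, so a comment in the ground truth would be ignored too.</p>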

<p>In the end, we might be able to score an answer like this:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">fn</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
    <span class="k">let</span> <span class="n">message</span> <span class="o">=</span> <span class="nd">format!</span><span class="p">(</span><span class="s">"Hello, World!"</span><span class="p">);</span>
    <span class="nd">println!</span><span class="p">(</span><span class="s">"{}"</span><span class="p">,</span> <span class="n">message</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
<p>Eh, that’s not really what we wanted but it’s ok.  I give this a code quality score of 4/10 while marking it a True Positive.  Then we assign weights and review feedback as a group.  This process is very subjective and tricky.  This is also why F1 scores cannot be used in a vacuum.  Developers <a href="https://en.wikipedia.org/wiki/The_Computer_Language_Benchmarks_Game">are used to</a> something like this.</p>

<p>The scoring process might take even more months to do and many iterations.  In the end, you would hopefully end up with a metric that you trust.  This would also be something you would revisit like other performance benchmarks.</p>

<h3 id="wrap-up">Wrap Up</h3>

<p>Note how we had to have many question and answer pairs.  Notice how we exploited certain truths about the data that we knew as humans:</p>

<ol>
  <li>We know Hello World should be an exact match because it’s a simple program with <a href="https://en.wikipedia.org/wiki/%22Hello,_World!%22_program">a long history</a>.  In this case we are exploiting some string matching on <code class="language-plaintext highlighter-rouge">Hello, World!</code> as the message to print.</li>
  <li>We know that code comments are not important, so our pre-processor can drop them.  We can exploit the fact that lines beginning with <code class="language-plaintext highlighter-rouge">//</code> can be filtered out or something.</li>
  <li>We can possibly exploit string nearness with <a href="https://en.wikipedia.org/wiki/Levenshtein_distance">Levenshtein distance</a> although this is most likely a bad path to go down.</li>
  <li>We know programs are executable so we could just run the thing and see what happens.  This might involve changing our approach from string matching to having our ground-truth Test Data be actual Rust tests.</li>
</ol>
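<p>For the string nearness idea, here is a small sketch of Levenshtein distance in plain Python.  The <code class="language-plaintext highlighter-rouge">similarity</code> helper and its normalization are my own assumptions, not a standard scoring method.</p>

```python
def levenshtein(a, b):
    # Classic dynamic programming over two rows of the edit-distance table.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (char_a != char_b)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


def similarity(a, b):
    # Normalize to the range 0.0 to 1.0, where 1.0 means identical strings.
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest
```

<p>Someone still has to choose a cutoff for what counts as close enough, and two programs can be close as strings while behaving very differently, which is part of why this is likely a bad path.</p>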

<h2 id="conclusion">Conclusion</h2>

<p>The main point I’m trying to make is that in some cases, AI/ML takes a different approach to solving problems than general software does, and I think machine learning evaluations need to be better understood, especially by those who are integrating these models.</p>

<p>Testing randomness is solvable and recognizable in general software.  I think I at least showed two basic approaches in about a paragraph while explaining evaluations even at a high level took a couple of pages.  I have read a lot of people telling one-off stories about ChatGPT making mistakes.  I think these are fine conversations to have as almost UX feedback but not as formal evaluations.  Formal is just the term people use for using a methodology.</p>

<p>This gets even more complicated when people are integrating LLMs for the first time and wondering if their idea “works”.  In general software, we usually avoid this problem with other tricks like mocking or seeding.  I’m always curious how they are going to find out if they have not generated 100s of test data questions or don’t know what an F1 score is.  I barely understand evaluations and certainly have not designed one.  I just have been around it a bit.</p>

<p>At the same time, I saw a determinism commonality between PRNGs and AI models and wanted to write a bit about its similarity.  If we were testing the PRNG itself, we might follow a similar approach where we try many executions, collect the results and try to analyze it (maybe check its distribution).  This is different than deterministic functional testing.</p>]]></content><author><name></name></author><summary type="html"><![CDATA[Let’s say we had a program that interacts with something that is random. This could be a pseudorandom number generator (PRNG) we can sort of control or it could be something out of our control. Let’s step through three examples, where the last one is random but wildly different. Our program depends on this function. How do we know our program works? Mock the Randomness Our program is just consuming the random function. If we print out the random number, we just need a number. We don’t care if the number is random or not. So the answer is simple: just avoid the randomness of the function. It’s the XKCD comic 221. int getRandomNumber() { return 4; // chosen by fair dice roll. // guaranteed to be random. } So, we could basically do the same by mocking out the random function to return 4. Problem avoided. Leverage the Seed If we need to functionally test what happens in different cases, maybe mocking the random function isn’t good enough. What if the number 9 crashes our program for some reason. We want 9 to be returned by the random number generator but also in a more complicated program (say a map generator), we might want to reproduce the state exactly for when 9 happens. This is pretty easy to deal with, we just pass a seed in. Many things use this technique: A test suite’s order in pytest-random-order and rspec. The minecraft /seed command so your friends can play the map you have. An avatar generator might have a seed as well as other inputs. 
Using the seed, you can replay or inspect the randomness as a way to make the behavior deterministic. Not everything can be made deterministic in a sort of clean room setting. The next example is very different. AI Models Let’s take a sharp turn and assume that the context you are in is AI, Machine Learning (ML) or LLMs. Suddenly we are encountering a very different kind of randomness. These models are not deterministic and they don’t have a seed. We can test our program around the randomness with mocking, perhaps. But what about the model itself? What if we need to know that it works? So, what are we talking about here? How do we know our model works? How do you test a model? How do you compare two models? How do you figure out if you have improved your model when you release a new version? How do we know if ChatGPT 4 is better than 3.5 beyond anecdotes? How does OpenAI determine that ChatGPT 4 is so good that it should be a paid upgrade over 3.5? The answer is evaluation but I want to break down what an evaluation is a bit. I have to caveat that my experience with ML evaluations is mostly surface level from proximity to research, PhDs, etc. I have never designed an evaluation, so my numbers here might be very rough. I’ll also caveat that my specific examples (like Llama 2) are locked in time to the publish date of this post. Let’s say that we wanted to see if Llama 2 is worse at generating Rust code than Llama 2 Code is. Seems reasonable, right? Llama Code was made for coding and Llama was not. So, how do we really know? The models are not deterministic. We can ask it the same question and we will get different answers. We can’t mock anything because we don’t want to test the code around the model, we want to test the model. There is no seed or anything to make it deterministic. And, we can’t inspect the model because these models are not really explainable (or at least not easily). So, our previous approaches don’t work. We need a new approach. 
The new approach will be to throw experiments at it and measure our results using some methodology. There are many methodologies but for the sake of brevity, we’ll use the classic F1 score. I’m not going to go into how that works but how the evaluation would roughly be run. The rough steps are going to be: Generate or obtain ground truth questions and answers and call it the “Test Data”. Keep the answers secret from the model. Give the Llama 2 model each question from the Test Data and note its answer. Do the same for the Llama 2 Code model. Score the results. These steps are easier said than done. Let’s try to break this down even more. What is involved in step 1? Step 1 - Generate Rust Questions One question will be this: Generate hello world in Rust. And the answer we expect is this: fn main() { println!("Hello, World!"); } We (as humans) all agree that this is the result that we expect given the question/prompt. No more context is allowed. This is one question and answer pair in our “Test Data” set. We might aim for 200 or 1,000 such questions. We might aim for a diverse set of questions, things way beyond hello world. As we select what we care about, we add bias to our evaluation. But bias is a different topic. Step 2 is simply file organization, really. This is more important if we were fine-tuning Llama or making a model ourselves. But since Llama 2 models are already done, we can just move on. As a side note, sometimes generating ground truth and test data can take a team of many people, many months to create. Step 3 and 4 - Runs For each question, run the model with the question. If we have Llama running somewhere, hit the API and record the result. We’ll get to the hard problem in Step 5. Repeat the calls for the other model. Organize the results. Maybe you have something like this: results/llama_2/0001.json results/llama_2/0002.json ... results/llama_2_code/0001.json results/llama_2_code/0002.json ... 
The results files might be API JSON responses or a file format we’ve invented with metadata and the response as an attribute. The results file format is not important. Step 5 - Scoring Now the tricky part, scoring. What we need is a very exact number for our F1 calculation. But we have a text answer coming back. Consider this answer for our hello world question: fn main() { println!("Hello, World!"); } It’s an exact match. So, let’s talk about how this is scored. F1 score is calculated from 3 metrics around Precision and Recall: True Positives (TP): 1 (since the generated code is correct) False Positives (FP): 0 (since there are no incorrect results generated) False Negatives (FN): 0 (since there are no correct results missed) Precision = TP / (TP + FP) Recall = TP / (TP + FN) Precision is sort of like a function of quality and Recall is a function of quantity. They both have to be considered at the same time. In this case (one question, one answer) our F1 score is 1.0. It doesn’t get any better than that. But our scorer is way too simple. Consider this answer: // this program prints hello world fn main() { println!("Hello, World!"); } We said that the generated code must match the answer. It doesn’t match the answer, even if we (as humans) know that code comments are ok. A tool like grep would fail on the complete string match of our Test Data answers True Positives (TP): 0 (since the generated code is incorrect) False Positives (FP): 1 (since there are no incorrect results generated) False Negatives (FN): 0 (since there are no correct results missed) What is our score now? It’s 0.0. But we (as humans) know that comments are ok, even encouraged in some cases. So, how could we get around this? We could pre-process our results to trim code comments to handle this specific case. What else could we do? Well, things get trickier after this point. 
Our scorer would have to improve or we’d have to use random sampling (as human domain experts) or other techniques to score our results. If we can make our scorer smarter then there is less and less manual work to do. Maybe we could use Levenshtein distance to fuzzy match bits of the program. Maybe we could break it apart with a parser. Maybe we could just run it and capture the results. Otherwise, lots of manual work. In the end, we might be able to score an answer like this: fn main() { let message = format!("Hello, World!"); println!("{}", message); } Eh, that’s not really what we wanted but it’s ok. I give this a code quality score of 4/10 while marking it a True Positive. Then we assign weights and review feedback as a group. This process is very subjective and tricky. This is also why F1 scores cannot be used in a vacuum. Developers are used to something like this. The scoring process might take even more months to do and many iterations. In the end, you would hopefully end up with a metric that you trust. This would also be something you would revisit like other performance benchmarks. Wrap Up Note how we had to have many questions and answer pairs. Notice how we exploited certain truths about the data that we knew as humans: We know Hello World should be an exact match because it’s a simple program with a long history. In this case we are exploiting some string matching on Hello, World!" as the message to print. We know that code comments are not important for our pre-processor. We can exploit the beginning of lines // are filtered out or something. We can possibly exploit string nearness with Levenshtein distance although this is most likely a bad path to go down. We know programs are executable so we could just run the thing and see what happens. This might involve changing our approach from string matching to having our ground-truth Test Data be actual Rust tests. 
Conclusion The main point I’m trying to make is that in some cases, AI/ML have had a different approach to solving problems than general software and I think machine learning evaluations need to be more understood especially by those who are integrating. Testing randomness is solvable and recognizable in general software. I think I at least showed two basic approaches in about a paragraph while explaining evaluations even at a high level took a couple of pages. I have read a lot of people telling one-off stories about ChatGPT making mistakes. I think these are fine conversations to have as almost UX feedback but not as formal evaluations. Formal is just the term people use for using a methodology. This gets even more complicated when people who are integrating LLMs for the first-time and wondering if their idea “works”. In general software, we usually avoid this problem with other tricks like mocking or seeding. I’m always curious how they are going to find out if they have not generated 100s of test data questions or know what an F1 score is. I barely understand evaluations and certainly have not designed one. I just have been around it a bit. At the same time, I saw a determinism commonality between PRNGs and AI models and wanted to write a bit about its similarity. If we were testing the PRNG itself, we might follow a similar approach where we try many executions, collect the results and try to analyze it (maybe check its distribution). 
This is different than deterministic functional testing.]]></summary></entry><entry><title type="html">The Bet Against Web Tech</title><link href="https://squarism.com/2021/12/03/the-bet-against-webtech/" rel="alternate" type="text/html" title="The Bet Against Web Tech" /><published>2021-12-03T00:00:00-08:00</published><updated>2021-12-03T00:00:00-08:00</updated><id>https://squarism.com/2021/12/03/the-bet-against-webtech</id><content type="html" xml:base="https://squarism.com/2021/12/03/the-bet-against-webtech/"><![CDATA[<p>Sometimes I will run into a comment or an opinion that basically boils down to a bet against web technology.  I wanted to collect my thoughts on this.  First I want to talk about GUIs, layout and web views and then I will collect a surprising list of native APIs and their Web equivalents.</p>

<h2 id="example-bets-in-the-gui-domain">Example Bets in the GUI Domain</h2>

<blockquote>
  <p>Someone somewhere: <br />
“Electron is too slow and Qt is the future, $25 on Qt please.”</p>
</blockquote>

<p>I agree with the impulse.  I don’t agree with the bet.  There are problems of feasibility and complete project context, but also just historical trends.  There are plenty of debates already online so there’s no need to rehash them here.  I wanted to instead just focus on one aspect of this, which is layout.</p>

<p>Layout has been implemented many times.  Almost none of these technologies have gathered human effort like the web has but let’s consider some past examples.  In the early Web, if you wanted lots of functionality (or all you knew was one tech stack) you had to reach for Java applets, Flash or some other browser plugin.</p>

<p>So if you picked Java to do a form, you would pick a layout class to use.  One is the <a href="https://docs.oracle.com/javase/tutorial/uiswing/layout/gridbag.html">GridBagLayout</a>.  An applet might have used this instead of form markup plus styling.</p>

<p><img alt="Grid Bag" style="width: 40%; margin: auto;" src="/uploads/2021/GridBagLayout.png" /></p>

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">button</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">JButton</span><span class="o">(</span><span class="s">"5"</span><span class="o">);</span>
<span class="n">c</span><span class="o">.</span><span class="na">ipady</span> <span class="o">=</span> <span class="mi">0</span><span class="o">;</span>       <span class="c1">//reset to default</span>
<span class="c1">// snip ...</span>
<span class="n">c</span><span class="o">.</span><span class="na">insets</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">Insets</span><span class="o">(</span><span class="mi">10</span><span class="o">,</span><span class="mi">0</span><span class="o">,</span><span class="mi">0</span><span class="o">,</span><span class="mi">0</span><span class="o">);</span>  <span class="c1">//top padding</span>
<span class="n">c</span><span class="o">.</span><span class="na">gridx</span> <span class="o">=</span> <span class="mi">1</span><span class="o">;</span>       <span class="c1">//aligned with button 2</span>
<span class="n">c</span><span class="o">.</span><span class="na">gridwidth</span> <span class="o">=</span> <span class="mi">2</span><span class="o">;</span>   <span class="c1">//2 columns wide</span>
<span class="n">c</span><span class="o">.</span><span class="na">gridy</span> <span class="o">=</span> <span class="mi">2</span><span class="o">;</span>       <span class="c1">//third row</span>
<span class="n">pane</span><span class="o">.</span><span class="na">add</span><span class="o">(</span><span class="n">button</span><span class="o">,</span> <span class="n">c</span><span class="o">);</span>
</code></pre></div></div>

<p>Of course there might have been tools to help you generate these layouts but this is essentially the CSS of Java GUIs.  If you squint hard enough, you can almost see a stylesheet in there.  They call it insets whereas CSS would call it padding.</p>

<p><a href="https://doc.qt.io/qt-5/layout.html">Qt does a similar thing</a>.  This is not me hating on Qt.  I breathe a sigh of relief when I can use Qt.  It’s quick, it’s light.  It looks nice when the scope is small.  I don’t want Qt apps to disappear.</p>

<h3 id="the-bet-over-time">The Bet Over Time</h3>

<p>So given a goal of distributing a GUI form with project and team constraints, maybe you would select Java and its GridBag.  But this technology is not compatible with or related to the web version you didn’t write.  Over time (with hindsight), this turned out to not be a good bet.  Flex, Flash, Shockwave, applets, Silverlight and ActiveX have come and gone and the pattern is still repeating today.  The web tech version we have now has not been perfect and <a href="https://neutralino.js.org/">I understand the critics</a>.</p>

<p>I would instead bet that distribution, updates, marketing, docs, interop and many other aspects of this hypothetical Java Applet project would eventually need something that is adjacent to web tech.  Maybe the page itself that contains the “active form” or the rich email that you will later send.</p>

<p>Web tech doesn’t auto-win.  I still like text.  TUIs are great (but perhaps a concession).  Native mobile is tricky and no one I talk to <em>really</em> likes generators or <a href="https://cordova.apache.org/">abstraction layers</a> they are using (but this is 2nd hand).</p>

<p>Regardless of current abstractions, I feel that complete web tech avoidance is a liability and the implementations can be fixed.  I have run into terrible web APIs written late or poorly as a feature reluctantly bolted on.  Malformed XML, weird JSON, wrong verbs.  Whatever a bad API is, it’s usually not coming from a web-native team or culture.  These are not specialized projects with an exempt domain.  Is that the bet coming due?</p>

<p>If I had to bet, I would bet that someone is going to solve Electron’s slowness, or find a performance solution in general, before Qt displaces web tech.  The best outcome would be a performance workaround or solution that keeps the Web APIs, to enable the most interop and reuse.  Given that, I’d rather adopt web tech related skills for the team.</p>

<p>The internal conflict I have is that I quite like Xcode from what I’ve used and I can’t imagine how you would have both.  Trying to use web tech naturally leads to a web view, which would have to have an entire engine in it.  Now we don’t have web with applet/flash plugins; we have native code with a web plugin.  It’s just swapping the framing.  If browsers on the desktop can do lightweight native web tech then mobile will too; that’s my hope anyway.</p>

<h2 id="many-domains-one-tech-stack">Many Domains, One Tech Stack</h2>

<p>The list of things web tech is solving is increasing.  There’s very little left untouched.  I’m almost speechless.  I flashed firmware onto a Teensy board using <a href="https://wicg.github.io/webusb/">WebUSB</a>.  I changed a configuration on an audio interface using <a href="https://www.w3.org/TR/webmidi/">WebMIDI</a>.  Someone told me that their browser was opening native files and autosaving to their filesystem and I said <em>“that’s impossible without a security vuln or something”</em> and lo and behold, <a href="https://developer.mozilla.org/en-US/docs/Web/API/FileSystem">I was extremely incorrect</a>.</p>

<p>In this vein, here is a list of technologies which were server-side, native or sacredly impossible to have a web alternative and now are in use or soon to be.</p>

<table>
  <thead>
    <tr>
      <th>Technology</th>
      <th>The Web Tech Version</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Unix Sockets</td>
      <td><a href="https://developer.mozilla.org/en-US/docs/Web/API/WebSocket">WebSockets</a></td>
    </tr>
    <tr>
      <td>OpenGL</td>
      <td><a href="https://developer.mozilla.org/en-US/docs/Web/API/WebGL_API">WebGL</a></td>
    </tr>
    <tr>
      <td>sqlite or small caches</td>
      <td><a href="https://developer.mozilla.org/en-US/docs/Web/API/Web_Storage_API">localStorage</a></td>
    </tr>
    <tr>
      <td>MIDI</td>
      <td><a href="https://developer.mozilla.org/en-US/docs/Web/API/Web_MIDI_API">WebMIDI</a></td>
    </tr>
    <tr>
      <td>Assembly</td>
      <td><a href="https://developer.mozilla.org/en-US/docs/WebAssembly">WebAssembly</a></td>
    </tr>
    <tr>
      <td>Bluetooth</td>
      <td><a href="https://developer.mozilla.org/en-US/docs/Web/API/Web_Bluetooth_API">WebBluetooth</a></td>
    </tr>
    <tr>
      <td>Filesystem</td>
      <td><a href="https://developer.mozilla.org/en-US/docs/Web/API/FileSystem">Native Filesystem API</a></td>
    </tr>
  </tbody>
</table>

<p>The list continues with similarities like what <a href="https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API">Web Workers</a> would equate to in an operating system context.  The list of things web tech is not solving is small regardless of what I think.</p>

<p>When I flashed a development board over WebUSB, there were two options: a binary or use the browser.  I used the browser.  Zero install and they can control distribution and the environment.</p>

<p><img alt="flashing firmware over webusb" style="width: 80%; margin: auto;" src="/uploads/2021/webusb.png" /></p>

<p>Look at the instructions at the bottom.  Visit <code class="language-plaintext highlighter-rouge">chrome://.../usbDevices</code>?  Amazing.</p>

<h2 id="the-web-is-the-biggest-target">The Web is the Biggest Target</h2>

<p>The web platform is the largest there is.  The <a href="https://developer.mozilla.org/en-US/docs/Web/API">list of technologies</a> is large.  The exclusivity and importance of the operating system is ending, and there is a force of people arriving at and contributing to a single stack instead of reimplementing bespoke things over and over again.  If Web Workers give you something like threads, why not just use them “for free” with an extremely easy distribution model versus trying to package and maintain Windows/Mac/Linux once again?</p>

<p>It’s not all roses.  I have a lot to say about nits and niggles in the <em>web tech</em> space but that will just have to be another post.  This topic can extend easily to backend web frameworks with a javascript avoidance bias but I want to keep this focused.  Consider these equivalent technologies and the problem with GUI technologies when betting against web tech.  Without a major black swan event, I don’t see these technologies (and then naturally skills) going away soon.</p>]]></content><author><name></name></author><summary type="html"><![CDATA[Sometimes I will run into a comment or an opinion that basically boils down to a bet against web technology. I wanted to collect my thoughts on this. First I want to talk about GUIs, layout and web views and then I will collect a surprising list of native APIs and their Web equivalents. Example Bets in the GUI Domain Someone somewhere: “Electron is too slow and Qt is the future, $25 on Qt please.” I agree on the impulse. I don’t agree on the bet. There’s a problem of feasibility, complete project context but also just historical trends. There are plenty of debates already online so there’s no need to rehash them here. I wanted to instead just focus on one aspect of this which is layout. Layout has been implemented many times. Almost none of these technologies have gathered human effort like the web has but lets consider some past examples. In the early Web, if you wanted lots of functionality (or all you knew is one tech stack) you had to reach for Java applets, Flash or some other browser plugin. So if you picked Java to do a form, you would pick a layout class to use. One is the GridBagLayout. An applet might have used this instead of form markup plus styling. button = new JButton("5"); c.ipady = 0; //reset to default // snip ... 
c.insets = new Insets(10,0,0,0); //top padding c.gridx = 1; //aligned with button 2 c.gridwidth = 2; //2 columns wide c.gridy = 2; //third row pane.add(button, c); Of course there might have been tools to help you generate these layouts but this is essentially the CSS of Java GUIs. If you squint hard enough, you can almost see a stylesheet in there. They call it insets whereas CSS would call it padding. Qt does a similar thing. This is not me hating on Qt. I breathe a sigh of relief when I can use Qt. It’s quick, it’s light. It looks nice when the scope is small. I don’t want Qt apps to disappear. The Bet Over Time So given a goal of distributing a GUI form with project and team constraints, maybe you would select Java and its GridBag. But this technology is not compatible or related to the web version you didn’t write. Over time (with hindsight), this turned out to not be a good bet. Flex, Flash, Shockwave, Applets, Silverlight, ActiveX have come and gone and the pattern is still repeating today. The web tech version we have now has not been perfect and I understand the critics. I would instead bet that distribution, updates, marketing, docs, interop and many other aspects of this hypothetical Java Applet project would eventually need something that is adjacent to web tech. Maybe the page itself that contains the “active form” or the rich email that you will later send. Web tech doesn’t auto-win. I still like text. TUIs are great (but perhaps a concession). Native mobile is tricky and no one I talk to really likes generators or abstraction layers they are using (but this is 2nd hand). Regardless of current abstractions, I feel that complete web tech avoidance is a liability and the implementations can be fixed. I have run into terrible web apis written late or poorly as a feature reluctantly bolted on. Misformed XML, weird JSON, wrong verbs. Whatever a bad API is, it’s usually not coming from a web-native team or culture. 
These are not specialized projects with an exempt domain. Is that the bet coming due? If I had to bet I would bet that someone is going to solve Electron’s slowness before Qt displaces web tech or find a performance solution in general. The best thing would be to have a performance workaround or solution while keeping the Web APIs to enable the most interop and reuse. Then I’d rather adopt web tech related skills for the team. The internal dichotomy I have is considering that I quite like Xcode from what I’ve used and I can’t imagine how you would have both. Trying to use web tech naturally leads to a web view which would have to have an entire engine in it. Now we don’t have web with applet/flash plugins; we have native code with a web plugin. It’s just swapping the framing. If browsers on the desktop can do lightweight native web tech then mobile will too, that’s my hope anyway. Many Domains, One Tech Stack The list of things web tech is solving is increasing. There’s very little left untouched. I’m almost speechless. I firmware flashed a teensy board using WebUSB. I changed a configuration on an audio interface using WebMIDI. Someone told me that their browser was opening native files and autosaving to their filesystem and I said “that’s impossible without a security vuln or something” and lo and behold, I was extremely incorrect. In this vein, here is a list of technologies which were server-side, native or sacredly impossible to have a web alternative and now are in use or soon to be. Technology The Web Tech Version Unix Sockets Websockets OpenGL WebGL sqlite or small caches localStorage MIDI WebMIDI Assembly WebAssembly Bluetooth WebBluetooth Filesystem Native Filesystem API The list continues with similarities like what Web Workers would equivocate to in an operating system context. The list of things web tech is not solving is small regardless of what I think. 
When I flashed a development board over WebUSB, there were two options: a binary or use the browser. I used the browser. Zero install and they can control distribution and the environment. Look at the instructions at the bottom. Visit chrome://.../usbDevices? Amazing. The Web is the Biggest Target The web platform is the largest there is. The list of technologies is large. The exclusivity and importance of the operating system is ending and there is a focus and a force by all of us arriving and contributing to a single stack instead of reimplementing bespoke things over and over again. If WebWorkers give you something like threads, why not just use it “for free” with an extremely easy distribution model versus trying to package and maintain Windows/Mac/Linux once again? It’s not all roses. I have a lot to say about nits and niggles in the web tech space but that will just have to be another post. This topic can extend easily to backend web frameworks with a javascript avoidance bias but I want to keep this focused. Consider these equivalent technologies and the problem with GUI technologies when betting against web tech. Without a major black swan event, I don’t see these technologies (and then naturally skills) going away soon.]]></summary></entry><entry><title type="html">The Docker Image Store Is Cache</title><link href="https://squarism.com/2021/11/05/the-docker-image-store-is-cache/" rel="alternate" type="text/html" title="The Docker Image Store Is Cache" /><published>2021-11-05T00:00:00-07:00</published><updated>2021-11-05T00:00:00-07:00</updated><id>https://squarism.com/2021/11/05/the-docker-image-store-is-cache</id><content type="html" xml:base="https://squarism.com/2021/11/05/the-docker-image-store-is-cache/"><![CDATA[<p>When you type <code class="language-plaintext highlighter-rouge">docker images</code> you get a list of docker images on your system.  The image itself is basically a tar file with a content hash.  
It’s cryptographically guaranteed (like git) to be the content you want because of this hash.  You have a local image store because it’s much faster to load content locally than over the internet.  So, in this way, docker images are a cache like any other cache.</p>
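<p>To make the content hash idea concrete, here is a minimal sketch of content addressing in plain shell.  It is not how Docker computes its image IDs; it only shows the property being relied on: the same bytes always give the same ID and any change gives a different one.  The <code class="language-plaintext highlighter-rouge">content_id</code> helper and the file names are made up for illustration, and it assumes GNU <code class="language-plaintext highlighter-rouge">sha256sum</code> is installed (on a Mac, <code class="language-plaintext highlighter-rouge">shasum -a 256</code> is the usual substitute).</p>

```shell
#!/usr/bin/env bash
# Sketch of content addressing: the identifier is derived from the bytes,
# so identical content always gets the same ID and any change gets a new one.
# Assumption: GNU coreutils `sha256sum` is on PATH.

# Hypothetical helper: name a file by the SHA-256 of its content.
content_id() {
  sha256sum "$1" | cut -d' ' -f1
}

workdir="$(mktemp -d)"

# Two files with identical bytes and one that differs.
printf 'layer contents\n' > "$workdir/original"
cp "$workdir/original" "$workdir/copy"
printf 'layer contents, modified\n' > "$workdir/modified"

id_original="$(content_id "$workdir/original")"
id_copy="$(content_id "$workdir/copy")"
id_modified="$(content_id "$workdir/modified")"

echo "original: $id_original"
echo "copy:     $id_copy"
echo "modified: $id_modified"

rm -rf "$workdir"
```

<p>The first two IDs match and the third does not, which is why a locally stored image can be trusted to be the same thing you would have pulled.</p>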

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>docker images
REPOSITORY    TAG       IMAGE ID       CREATED         SIZE
&lt;none&gt;        &lt;none&gt;    77af4d6b9913   19 hours ago    1.089 GB
postgres      latest    746b819f315e   4 days ago      213.4 MB
</code></pre></div></div>

<h2 id="cache-invalidation-is-hard">Cache Invalidation is Hard</h2>

<p>What’s the hardest problem with caching?  Expiration!  So knowing when to “expire” a docker image must be pretty tricky.  Spoiler: it is.  First, let’s talk about why we’d want to expire or even manage our docker images to begin with.</p>

<p>Docker has this image store but in the beginning they didn’t have a clear procedure for what you were supposed to do when you accumulate more and more images.  You either fill your physical disk on Linux or fill a virtual disk on Mac/Windows and then docker stops working.  There are <a href="https://github.com/docker/for-mac/issues/371">countless threads</a> about the <code class="language-plaintext highlighter-rouge">qcow</code> file but there was never any guidance as to what people were supposed to do to manage the images.</p>

<p>Spotify created <a href="https://github.com/spotify/docker-gc">a helper script called</a> <code class="language-plaintext highlighter-rouge">docker-gc</code> to help manage this problem.  It’s been archived in favor of <code class="language-plaintext highlighter-rouge">docker system prune</code> but this is not a complete solution.  Removing images is like expiring the cache; the tricky part is knowing when to do so.</p>

<p>When you are building images, docker will print out intermediate steps as SHAs that you can enter and debug.  When you tag a SHA hash, you are tagging a <code class="language-plaintext highlighter-rouge">&lt;none&gt;</code>.  If you run <code class="language-plaintext highlighter-rouge">docker system prune</code> then you could throw away data you care about, and <code class="language-plaintext highlighter-rouge">image prune</code> is not much better.  What’s more, you are blowing away your cache so builds will take longer.  Spotify’s docker-gc let you specify images you want to keep, which is useful in cases like these when you are making images.</p>

<p>Sometimes, people suggest cron’ing the prune so the disk never fills up.  If I cron <code class="language-plaintext highlighter-rouge">docker system prune</code> I’ll lose data (or at least cache hits) every day and might not know why.  And it’s still not solving expiring the cache.  Eventually your disk could fill up with tagged images and <code class="language-plaintext highlighter-rouge">system prune</code> will not have saved you.  Your image store is a cache and you can’t tell if it’s out of date or full of unused things.  What’s worse is, given no running containers these prune commands will basically empty your image store.  So what’s the point of the image store?</p>

<p>You could devise a way to filter docker images using Go templates but this is far too advanced for the use case Docker is aiming for.</p>
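<p>As a rough sketch of what that could look like: the <code class="language-plaintext highlighter-rouge">--format</code> flag on <code class="language-plaintext highlighter-rouge">docker images</code> takes a Go template, and the output can be piped through a keep-list filter.  The <code class="language-plaintext highlighter-rouge">removal_candidates</code> function and the keep-list contents below are hypothetical, and the real docker call is left as a comment so the filter itself can be tried without a daemon.</p>

```shell
#!/usr/bin/env bash
# Sketch: choose removal candidates from `docker images` output while
# protecting a keep-list, loosely in the spirit of docker-gc's exclude list.

# Hypothetical keep-list: repositories we never want to delete.
KEEP_PATTERN='^(postgres|redis):'

# Reads "repository:tag id" lines on stdin and prints the IDs of images
# whose repository is NOT on the keep-list.
removal_candidates() {
  grep -Ev "$KEEP_PATTERN" | awk '{print $2}'
}

# Real usage (needs a running docker daemon), shown as a comment only:
#   docker images --format '{{.Repository}}:{{.Tag}} {{.ID}}' | removal_candidates

# Demo on canned input that mimics the listing shown earlier.
printf '%s\n' 'postgres:latest 746b819f315e' 'myapp:dev 77af4d6b9913' | removal_candidates
```

<p>The printed IDs could then be handed to <code class="language-plaintext highlighter-rouge">docker rmi</code>.  Note that this still only answers “what do I want to keep”, not “what is stale”, so it is a workaround and not cache invalidation.</p>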

<h2 id="my-suggestions">My Suggestions</h2>

<p>I don’t have a list of quick fixes for the difficult problem of cache invalidation.  However, in a perfect world:</p>

<ol>
  <li>Docker would run on all OSes natively so the disk usage would be more obvious.</li>
  <li>Keep using docker-gc and ignore their advice.</li>
  <li>Write an updater that works like <code class="language-plaintext highlighter-rouge">apt</code> where it tells you that your images are old.  This isn’t easy but this is event-driven and not time-driven.  Event-driven caches are more precise.</li>
  <li>Write an alternate utility that works like docker-gc using Go templates to tag and manage images.</li>
  <li>Docker would provide tooling or some advice as to what to do when people use their product enough to fill their disk.</li>
  <li>Run a local caching server to mitigate internet pulls?  You’re still building all the time and losing partial images if you are cron’ing or running prune.</li>
  <li>Do the typical cache shrug thing of using time to decide when to expire.  <code class="language-plaintext highlighter-rouge">docker system prune -af --filter "until=$((30*24))h"</code> You’ll still lose cache hits and lose data, just delayed.  At least you’ll eventually get new images of <code class="language-plaintext highlighter-rouge">latest</code> I guess.</li>
</ol>
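<p>The time-based option in the last item relies on shell arithmetic to turn days into the hours that the <code class="language-plaintext highlighter-rouge">until</code> filter expects.  Here is a small sketch of that conversion as a dry run; <code class="language-plaintext highlighter-rouge">until_filter</code> is a made-up helper name and the command is only echoed, never executed.</p>

```shell
#!/usr/bin/env bash
# Sketch: build the time-based prune filter from a number of days.

# Hypothetical helper: turn days into the "until=<hours>h" filter value.
until_filter() {
  local days="$1"
  echo "until=$((days * 24))h"
}

# Dry run: print the command instead of executing it.
echo docker system prune -af --filter "$(until_filter 30)"
# prints: docker system prune -af --filter until=720h
```

<p>Dropping the <code class="language-plaintext highlighter-rouge">echo</code> would run the prune for real, with all the cache-miss caveats above.</p>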

<p>I hope this post explains a bit of what’s really going on and why this is so tricky.  The docker image store is a cache.</p>]]></content><author><name></name></author><summary type="html"><![CDATA[When you type docker images you get a list of docker images on your system. The image itself is basically a tar file with a content hash. It’s cryptographically guaranteed (like git) to be the content you want because of this hash. You have a local image store because it’s much faster to load content locally than over the internet. So, in this way, docker images are a cache like any other cache. $ docker images REPOSITORY TAG IMAGE ID CREATED SIZE &lt;none&gt; &lt;none&gt; 77af4d6b9913 19 hours ago 1.089 GB postgres latest 746b819f315e 4 days ago 213.4 MB Cache Invalidation is Hard What’s the hardest problem with caching? Expiration! So knowing when to “expire” a docker image must be pretty tricky. Spoiler: it is. First, let’s talk about why we’d want to expire or even manage our docker images to begin with. Docker has this image store but in the beginning they didn’t have a clear procedure for what you were supposed to do when you accumulate more and more images. You either fill your physical disk on Linux or fill a virtual disk on Mac/Windows and then docker stops working. There are countless threads about the qcow file but there was never any guidance as to what people were supposed to do to manage the images. Spotify created a helper script called docker-gc to help manage this problem. It’s been archived in favor of docker system prune but this is not a complete solution. Removing images is like expiring the cache, the tricky part is knowing when to do so. When you are building images, docker will print out intermediate steps as SHAs that you can enter and debug. When you tag a SHA hash, you are tagging a &lt;none&gt;. If you docker system prune then you could throw away data you care about and image prune is not much better. 
More so, you are blowing away your cache so builds will take longer. Spotify’s docker-gc let you specify images you want to keep which is useful for the previous examples when you are making images. Sometimes, people suggest cron’ing the prune so the disk never fills up. If I cron docker system prune I’ll lose data (or at least cache hits) every day and might not know why. And it’s still not solving expiring the cache. Eventually your disk could fill up with tagged images and system prune will not have saved you. Your image store is a cache and you can’t tell if it’s out of date or full of unused things. What’s worse is, given no running containers these prune commands will basically empty your image store. So what’s the point of the image store? You could devise a way to filter docker images using Go templates but this is far too advanced for the use case Docker is aiming for. My Suggestions I don’t have a list of quick fixes for the difficult problem of cache invalidation. However, in a perfect world: Docker would run on all OS’s natively so the disk usage would be more obvious. Keep using docker-gc and ignore their advice. Write an updater that works like apt where it tells you that your images are old. This isn’t easy but this is event-driven and not time-driven. Event-driven caches are more precise. Write an alternate utility that works like docker-gc using Go templates to tag and manage images. Docker would provide tooling or some advice as to what to do when people use their product enough to fill their disk. Run a local caching server to mitigate internet pulls? You’re still building all the time and losing partial images if you are cron’ing or running prune. Do the typical cache shrug thing of using time as when to expire. docker system prune -af --filter "until=$((30*24))h" You’ll lose cache hits and lose data, just delayed. At least you’ll eventually get new images of latest I guess. 
I hope this post explains a bit of what’s really going on and why this is so tricky. The docker image store is a cache.]]></summary></entry><entry><title type="html">Toxic Places with No Inputs</title><link href="https://squarism.com/2021/10/29/toxic-places-with-no-inputs/" rel="alternate" type="text/html" title="Toxic Places with No Inputs" /><published>2021-10-29T00:00:00-07:00</published><updated>2021-10-29T00:00:00-07:00</updated><id>https://squarism.com/2021/10/29/toxic-places-with-no-inputs</id><content type="html" xml:base="https://squarism.com/2021/10/29/toxic-places-with-no-inputs/"><![CDATA[<p>I went to a talk by <a href="https://en.wikipedia.org/wiki/42_Entertainment">Susan Bonds</a> who worked on projects like I Love Bees, a Christopher Nolan project and a <a href="https://en.wikipedia.org/wiki/Trent_Reznor">Trent Reznor</a> augmented reality project.  I was fortunate enough to be the only one who knew who Trent Reznor was at the talk so when we went to lunch I got to sit across from her at the lunch table and everyone just listened to us talk.  It was a strange experience.  This was a long time ago but I want to talk about a very particular thing which has stayed with me.</p>

<p>Susan was hired to inject some life into the Nine Inch Nails (NIN) forum.  NIN hadn’t really been making as much music as before and my take was that people were scared that their favorite band (one of my favs too) weren’t making music anymore and the glory days had passed.  Hang onto this idea.</p>

<p><img alt="year zero spectrogram" style="width: 100%; margin: auto;" src="/uploads/2021/year_zero.jpg" /></p>

<p>So she developed a series of alternate reality games and events that reinvigorated the fanbase.  It was an extremely <a href="https://en.wikipedia.org/wiki/Year_Zero_(album)#Promotion">interesting series</a> of hidden mp3 sticks and hidden puzzles, ending in a staged concert that was broken up by fake police and tear gas.  The fans loved it.  They created wikis and collected information.  There were fake shutdowns and images “sent from the future”.  It was an ARG.</p>

<p>You can see in the picture above a spectrogram of the found-in-a-bathroom mp3 file with a hidden image that was itself a pretend “leaked from the future” image of an alien.  The fervor and excitement must have been crazy.  But remember the idea I asked you to hang onto.  This entire ARG campaign was started to inject new energy into a forum that was eating itself alive.  It didn’t go unnoticed.</p>

<p>I think about Slashdot, Perl forums, Usenet and yes even boards I associate myself with like Ruby.  When Ruby came out it was in direct competition with Perl.  Perl gained fear of Ruby.  When Node came out, Ruby gained fear of Node.  When Go came out, Node perhaps gained fear of Go.  And so on to Rust and to Zig and to whatever else is next.  Each new generation provokes fear in the old one.  But here I’m specifically talking about the lack of input and what the forums are like.  Without input, things stew and ferment.</p>

<p>Elixir breathed new life into the Erlang community and Joe Armstrong was happy for it. To me, this is the most mature way to look at it. I can’t imagine average forums of fear having this kind of positive attitude with strangers over anonymous text.</p>]]></content><author><name></name></author><summary type="html"><![CDATA[I went to a talk by Susan Bonds who worked on projects like I Love Bees, a Christopher Nolan project and a Trent Reznor augmented reality project. I was fortunate enough to be the only one who knew who Trent Reznor was at the talk so when we went to lunch I got to sit across from her at the lunch table and everyone just listened to us talk. It was a strange experience. This was a long time ago but I want to talk about a very particular thing which has stayed with me. Susan was hired to inject some life into the Nine Inch Nails (NIN) forum. NIN hadn’t really been making as much music as before and my take was that people were scared that their favorite band (one of my favs too) weren’t making music anymore and the glory days had passed. Hang onto this idea. So she developed a series of augmented reality games and events that reinvigorated the fanbase. It was an extremely interesting series of hidden mp3 sticks and hidden puzzles, ending in a staged concert that was broken up by fake police and tear gas. The fans loved it. They created wikis and collected information. There were fake shutdowns and images “sent from the future”. It was an ARG. You can see in the picture above a spectrogram of the found-in-a-bathroom mp3 file with a hidden image that was itself a pretend “leaked from the future” image of an alien. The fervor and excitement must have been crazy. But think about what I said about “Hang onto this idea”. This entire ARG campaign was started to inject new energy into a forum who were eating themselves alive. It didn’t go unnoticed. I think about Slashdot, Perl forums, Usenet and yes even boards I associate myself with like Ruby. 
When Ruby came out it was in direct competition with Perl. Perl gained fear of Ruby. When Node came out, Ruby gained fear of Node. When Go came out, Node perhaps gained fear of Go. And so on to Rust and to Zig and to whatever else is next. Each generation causes fear from the old. But here I’m specifically talking about the lack of input and what the forums are like. Without input, things stew and ferment. Elixir breathed new life into the Erlang community and Joe Armstrong was happy for it. To me, this is the most mature way to look at it. I can’t imagine average forums of fear having this kind of positive attitude with strangers over anonymous text.]]></summary></entry></feed>