{
	"version": "https://jsonfeed.org/version/1",
	"title": "XlogicX",
	"icon": "https://www.gravatar.com/avatar/c23fcb4fa77c545bf1928f65465cedd5?s=96&d=https%3A%2F%2Fmicro.blog%2Fimages%2Fblank_avatar.png",
	"home_page_url": "https://xlogicx.micro.blog/",
	"feed_url": "https://xlogicx.micro.blog/feed.json",
	"items": [
			{
				"id": "http://xlogicx.micro.blog/2026/06/27/language-is-forgiving-regex-isnt.html",
				"title": "Language is forgiving, Regex isn't",
				"content_html": "<h2 id=\"some-background-context\">Some background context</h2>\n<p>I&rsquo;ve watched the early failings of LLMs as things have progressively improved. Many people point to overconfident hallucinations and inaccuracies. I also noticed they were bad at regular expressions. In the early stages of LLMs being kind of a &lsquo;toy,&rsquo; none of this mattered to me much. And then we started seeing &lsquo;AI&rsquo; being ingested into the pipelines of real products. Yeah, there were still overconfident hallucinations used for summaries of things. I was also seeing LLMs being used to generate things where regex was the consumable content (e.g. security detections). The regex was often wrong. If not wrong, poorly formatted and hard to maintain. But what would be harder to detect with a naked eye: never tested to be performant.</p>\n<p>Since then, LLMs have gotten profoundly better, but not for critical aspects of regular expressions. And at the Frontier LLM level, they won&rsquo;t get better for a while. That is what this post is about. I will explain why. I will show my tests. I will offer solutions.</p>\n<h2 id=\"why-fluent-language-models-write-dangerous-regular-expressions\">Why fluent language models write dangerous regular expressions</h2>\n<p>Large language models are now extremely good with language. They draft, summarize, translate, argue, and imitate voice with a fluency that still surprises people who use them daily. However, with regular expressions, they can be subtly wrong, or worse, quietly catastrophic: a pattern that passes every example you test it on and then pins a CPU at 100% the first time it meets an adversarial input.</p>\n<p>Regex is different from natural language. Natural language plays to everything a transformer is built to do. Regular expressions attack the one thing it can&rsquo;t. 
This piece is about that asymmetry, specifically in the <em>naive</em> setting that matters most in practice: a single pass, no tools, no test harness, the model asked to produce a pattern and trusted to get it right.</p>\n<h2 id=\"llms-are-good-at-language-duh\">LLMs are good at language (duh)</h2>\n<p>A language model is a next-token predictor trained on an insane quantity of human text. Three properties of natural language make that objective and that training data an almost perfect match for the task.</p>\n<p>First, the supervision is enormous and self-describing. Language is the most abundant structured data in existence, and it explains itself as it goes: context, paraphrase, definition, correction. The model sees the same idea expressed ten thousand ways, with the meaning carried in the surrounding words. It learns the distribution of <em>how humans say things</em> because that distribution is written down everywhere.</p>\n<p>Second, language is statistical and redundant. There is rarely a single correct output. A dozen phrasings of the same sentence are all acceptable, and a slightly off word choice still communicates. The objective (produce a likely continuation) <em>is</em> the task. The model is rewarded for landing in a large, fuzzy region of &ldquo;good enough,&rdquo; and natural language is full of such regions.</p>\n<p>Third, meaning lives at roughly the granularity the model operates on. Subword tokens and attention over them line up reasonably well with morphemes, words, and phrases, the units where linguistic meaning actually accumulates. The representation and the semantics are, loosely, in the same place. You can train a language model per character instead. I have lab&rsquo;d this with some &ldquo;small&rdquo; language models on a corpus like the Bible or the works of H.G. Wells. It&rsquo;s not really the best way to do things, because lots of training is spent on learning how to spell before learning language. 
Frontier models typically go with subword tokens that track morphemes, words, and phrases.</p>\n<p>Put together: abundant annotated data, a forgiving target with many right answers, and a representation aligned with where meaning lives. Fluency is what you&rsquo;d expect, not what should astonish us.</p>\n<h2 id=\"regex-is-none-of-those-things\">Regex is none of those things</h2>\n<p>A regular expression is not a likely continuation of anything. It is a formal object. It denotes a precise set of strings: a regular language with exact automaton semantics underneath. There is a right answer and an infinitude of wrong ones, and the wrong ones are not gracefully wrong. &ldquo;Close&rdquo; can mean &ldquo;accepts an empty string it should reject,&rdquo; or &ldquo;matches across a delimiter it should respect,&rdquo; or &ldquo;runs in exponential time.&rdquo; None of the forgiveness that makes language tractable survives the jump to a formal notation where every character carries exact composable meaning and a single misplaced quantifier changes the language.</p>\n<p>There are really two distinct ways a generated regex can be wrong, and the model is weak on both for related reasons.</p>\n<p>The first is <strong>correctness</strong>: does the pattern denote the language you intended? For common, heavily attested formats (email-ish strings, dotted labels, ISO dates), the model has effectively memorized the right shape and reproduces it well. It is pattern-matching against a regex it has seen thousands of times. But hand it a <em>novel</em> rule, one with no canonical regex in the training corpus, and it has to construct the language rather than recall it. Construction is where the cracks show. 
The model approximates the target by stitching together familiar fragments, and on an unfamiliar specification that approximation drifts, over-accepting here, missing an edge there, because it is imitating the look of correct regex rather than reasoning about what the automaton accepts.</p>\n<p>The second failure is the dangerous one, because it is invisible.</p>\n<h2 id=\"the-invisible-failure-catastrophic-backtracking\">The invisible failure: catastrophic backtracking</h2>\n<p>A regex can be perfectly correct and still be a denial-of-service vulnerability. Patterns with ambiguous, overlapping repetition (the classic <code>(a+)+</code>, or any &ldquo;repeated unit with an optional in-band separator&rdquo;) force a backtracking engine to explore exponentially many ways to partition the input. On a string that <em>almost</em> matches and then fails at the end, the engine tries every partition before giving up. Twenty-six characters can take seconds. A few more, and it effectively never returns. This is ReDoS (regular expression denial of service).</p>\n<p>Here is why the model is structurally blind to it. Catastrophic behavior is a property of the regex&rsquo;s <em>runtime</em>, and runtime appears nowhere in the source text. Regex is typically written once, committed without tests, and rarely discussed with performance annotations the way code is. So the model has almost no training signal connecting regex syntax to execution cost. It cannot feel that <code>(t+k?)+z</code> is exponential the way an engineer who has been paged at 3 a.m. can, because it has never been paged and has no internal automaton to simulate the blowup. It produces non-catastrophic patterns when it happens to be imitating cautious examples and catastrophic ones when imitating sloppy examples, with no mechanism to tell the two apart.</p>\n<p>In testing, this is exactly what you see. Give the model a format whose safe, linear pattern is obvious, and it reliably finds it. 
Give it a format where the safe pattern requires a non-obvious restructuring (recognizing that an optional separator must be made mandatory between elements and handled separately at the boundary), and it will confidently produce the exponential nested form. The damning part: that output passes every functional test case. It is <em>correct</em>. It just falls off a cliff on an input a junior attacker could craft. The model had no signal that anything was wrong, because nothing about the wrongness is visible in the text it was trained on.</p>\n<h2 id=\"why-scale-wont-simply-fix-it\">Why scale won&rsquo;t simply fix it</h2>\n<p>Underneath both failures is one fact about the architecture: a transformer pattern-matches against a learned distribution. It does not execute automata, and it does not construct them. Natural language rewards exactly that behavior. Regex requires the opposite. Deciding whether a regex denotes the intended language, or whether its NFA has the ambiguous states that cause exponential paths, is automaton-level reasoning: the kind of thing you <em>run</em>, not the kind of thing you pattern-match. The model can approximate it from examples, and on familiar shapes the approximation is good enough to look like understanding. On unfamiliar or adversarial shapes the approximation is all there is, and it is not grounded in the semantics that would make it reliable.</p>\n<p>This is also why the obvious fixes are weaker than they sound. Better tokenization, even character-level, doesn&rsquo;t address it, because the bottleneck is computational, not representational. And simply letting the model &ldquo;think harder&rdquo; (more internal reasoning before answering) helps only at the margins, because more reasoning is still more pattern-matching over the same priors. It is not the same as having an automaton to consult. 
You can expect the rate of dangerous output to drop somewhat as you crank up deliberation, but not to vanish, because the missing capability was never a question of effort.</p>\n<h2 id=\"but-its-great-at-code-the-obvious-objection\">&ldquo;But it&rsquo;s great at code&rdquo;: the obvious objection</h2>\n<p>If LLMs struggle with formal notation, why are they so genuinely useful at writing code, which is also formal? Both things are true, and the reconciliation is instructive. Code generation benefits from compensating factors that regex simply doesn&rsquo;t have:</p>\n<ul>\n<li><strong>Volume and annotation.</strong> Code is the most abundant structured text after prose, and a huge fraction of it is explained: commits, issues, reviews, tests, docs, Q&amp;A sites that spell out <em>why</em> a snippet does what it does. Regex is written once and hardly discussed at the same level.</li>\n<li><strong>Decomposition.</strong> Code breaks into functions and modules, units the size a model handles well. A regex is a single dense expression where every character matters and nothing can be swapped in as a subroutine.</li>\n<li><strong>Mature tool loops.</strong> The standard way to get good code from a model is generate-test-fix against compilers, type checkers, and test suites. That loop barely exists for regex; almost nobody runs a ReDoS checker in their workflow.</li>\n<li><strong>Targeted training.</strong> Code quality has received enormous, deliberate reinforcement. Regex hasn&rsquo;t, because the business case is smaller.</li>\n</ul>\n<p>Strip those four supports away and you get regex. Notably, the same silent-failure class that bites regex also lurks in code: performance bugs, race conditions, injectable queries, exactly the places where the failure isn&rsquo;t visible in the source and there&rsquo;s no cheap checker in the loop. 
Regex is just the purest, most concentrated instance of the problem.</p>\n<h2 id=\"poc-or-gtfo\">PoC or GTFO</h2>\n<p>Before getting into the fix, I will show examples of how bad this can get. I did my tests in API mode to better control variables and get closer to a stripped-down LLM instead of the extended thinking and validation loops that Desktop/Cowork/Code can use. Is this unfair? No, for a couple of reasons. When regex generation is pipelined into vendor content, it would typically be at this &lsquo;one-pass&rsquo; API level, not a full agentic Cowork session per expression. But even in Cowork, your expression isn&rsquo;t going to be tested for performance.</p>\n<p>For my tests, I used differing token limits, thinking types, and effort levels. Here&rsquo;s an example of the code used with one of those configurations.</p>\n<div class=\"highlight\"><pre tabindex=\"0\" style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-python\" data-lang=\"python\"><span style=\"color:#f92672\">import</span> anthropic\n<span style=\"color:#f92672\">from</span> dotenv <span style=\"color:#f92672\">import</span> load_dotenv; load_dotenv()\n\nVALID <span style=\"color:#f92672\">=</span> <span style=\"color:#e6db74\">&#34;&#34;&#34;</span><span style=\"color:#ae81ff\">\\\n</span><span style=\"color:#ae81ff\"></span><span style=\"color:#e6db74\">tz\n</span><span style=\"color:#e6db74\">ttz\n</span><span style=\"color:#e6db74\">tktz\n</span><span style=\"color:#e6db74\">ttktttz\n</span><span style=\"color:#e6db74\">tkz\n</span><span style=\"color:#e6db74\">ttkttz\n</span><span style=\"color:#e6db74\">tttz\n</span><span style=\"color:#e6db74\">tktktz\n</span><span style=\"color:#e6db74\">tkttz\n</span><span style=\"color:#e6db74\">ttkz\n</span><span style=\"color:#e6db74\">&#34;&#34;&#34;</span>\n\nINVALID <span style=\"color:#f92672\">=</span> <span style=\"color:#e6db74\">&#34;&#34;&#34;</span><span 
style=\"color:#ae81ff\">\\\n</span><span style=\"color:#ae81ff\"></span><span style=\"color:#e6db74\">z\n</span><span style=\"color:#e6db74\">ktz\n</span><span style=\"color:#e6db74\">tkkz\n</span><span style=\"color:#e6db74\">tt\n</span><span style=\"color:#e6db74\">kz\n</span><span style=\"color:#e6db74\">tzk\n</span><span style=\"color:#e6db74\">t\n</span><span style=\"color:#e6db74\">tk\n</span><span style=\"color:#e6db74\">ztt\n</span><span style=\"color:#e6db74\">ttzz\n</span><span style=\"color:#e6db74\">tttttttttttttttttt\n</span><span style=\"color:#e6db74\">&#34;&#34;&#34;</span>\n\nPROMPT <span style=\"color:#f92672\">=</span> (\n    <span style=\"color:#e6db74\">&#34;Each line below is either a valid tally code or not. A tally code is a series &#34;</span>\n    <span style=\"color:#e6db74\">&#34;of one or more clusters. A cluster is one or more &#39;t&#39; symbols. A cluster may be &#34;</span>\n    <span style=\"color:#e6db74\">&#34;followed by a single linker &#39;k&#39;. Every code ends with the terminator &#39;z&#39;. 
Write &#34;</span>\n    <span style=\"color:#e6db74\">&#34;a single regular expression that matches every VALID code and none of the INVALID &#34;</span>\n    <span style=\"color:#e6db74\">&#34;strings.</span><span style=\"color:#ae81ff\">\\n\\n</span><span style=\"color:#e6db74\">VALID:</span><span style=\"color:#ae81ff\">\\n</span><span style=\"color:#e6db74\">&#34;</span> <span style=\"color:#f92672\">+</span> <span style=\"color:#e6db74\">&#34;</span><span style=\"color:#ae81ff\">\\n</span><span style=\"color:#e6db74\">&#34;</span><span style=\"color:#f92672\">.</span>join(VALID) <span style=\"color:#f92672\">+</span> <span style=\"color:#e6db74\">&#34;</span><span style=\"color:#ae81ff\">\\n\\n</span><span style=\"color:#e6db74\">INVALID:</span><span style=\"color:#ae81ff\">\\n</span><span style=\"color:#e6db74\">&#34;</span> <span style=\"color:#f92672\">+</span> <span style=\"color:#e6db74\">&#34;</span><span style=\"color:#ae81ff\">\\n</span><span style=\"color:#e6db74\">&#34;</span><span style=\"color:#f92672\">.</span>join(INVALID) <span style=\"color:#f92672\">+</span>\n    <span style=\"color:#e6db74\">&#34;</span><span style=\"color:#ae81ff\">\\n\\n</span><span style=\"color:#e6db74\">Return the regular expression.&#34;</span>\n)\n\nclient <span style=\"color:#f92672\">=</span> anthropic<span style=\"color:#f92672\">.</span>Anthropic()\nresp <span style=\"color:#f92672\">=</span> client<span style=\"color:#f92672\">.</span>messages<span style=\"color:#f92672\">.</span>create(\n    model<span style=\"color:#f92672\">=</span><span style=\"color:#e6db74\">&#34;claude-opus-4-8&#34;</span>,\n    max_tokens<span style=\"color:#f92672\">=</span><span style=\"color:#ae81ff\">21000</span>,\n    thinking<span style=\"color:#f92672\">=</span>{<span style=\"color:#e6db74\">&#34;type&#34;</span>: <span style=\"color:#e6db74\">&#34;adaptive&#34;</span>},\n    output_config<span style=\"color:#f92672\">=</span>{<span style=\"color:#e6db74\">&#34;effort&#34;</span>: 
<span style=\"color:#e6db74\">&#34;max&#34;</span>},    \n    messages<span style=\"color:#f92672\">=</span>[{<span style=\"color:#e6db74\">&#34;role&#34;</span>: <span style=\"color:#e6db74\">&#34;user&#34;</span>, <span style=\"color:#e6db74\">&#34;content&#34;</span>: PROMPT}],\n)\nprint(next(b<span style=\"color:#f92672\">.</span>text <span style=\"color:#66d9ef\">for</span> b <span style=\"color:#f92672\">in</span> resp<span style=\"color:#f92672\">.</span>content <span style=\"color:#66d9ef\">if</span> b<span style=\"color:#f92672\">.</span>type <span style=\"color:#f92672\">==</span> <span style=\"color:#e6db74\">&#34;text&#34;</span>))\n</code></pre></div><h3 id=\"results\">Results</h3>\n<h4 id=\"tokens--1024-10-tests\">tokens = 1024 (10 tests)</h4>\n<ul>\n<li><code>^(t+k?)+z$</code></li>\n<li><code>^(t+k?)+z$</code></li>\n<li><code>^(t(\\nt)*(\\nk)?\\n)*t(\\nt)*(\\nk)?\\nz$</code></li>\n<li><code>^(t(\\nt)*(\\nk)?)(\\n(t(\\nt)*(\\nk)?))*\\nz$</code></li>\n<li><code>(t+k?)+z</code></li>\n<li><code>^t(\\nt)*(\\nk)?(\\nt(\\nt)*(\\nk)?)*\\nz$</code></li>\n<li><code>^(t+k?)+z$</code></li>\n<li><code>^t(\\nt)*(\\nk)?(\\nt(\\nt)*(\\nk)?)*\\nz$</code></li>\n<li><code>^(t+k?)+z$</code></li>\n<li><code>^t(\\nt)*(\\nk)?(\\nt(\\nt)*(\\nk)?)*\\nz$</code></li>\n</ul>\n<h4 id=\"tokens--21000-5-tests\">tokens = 21000 (5 tests)</h4>\n<ul>\n<li><code>^t(\\nt)*(\\nk)?(\\nt(\\nt)*(\\nk)?)*\\nz$</code></li>\n<li><code>^t(\\nt)*(\\nk)?(\\nt(\\nt)*(\\nk)?)*\\nz$</code></li>\n<li><code>^(t+k?)+z$</code></li>\n<li><code>^t+(kt+)*k?z$</code></li>\n<li><code>^(t+k?)+z$</code></li>\n</ul>\n<h4 id=\"tokens--21000-max-effort\">tokens = 21000, max effort</h4>\n<ul>\n<li><code>^(t+k?)+z$</code></li>\n<li><code>^(t+k?)+z$</code></li>\n<li><code>^\\s*t(\\s+t)*(\\s+k)?(\\s+t(\\s+t)*(\\s+k)?)*\\s+z\\s*$</code></li>\n<li><code>^(t+k?)+z$</code></li>\n<li><code>^(t+k?)+z$</code></li>\n</ul>\n<h4 id=\"tokens--21000-max-effort-adaptive-thinking\">tokens = 21000, max effort, adaptive 
thinking</h4>\n<ul>\n<li><code>^(t+k?)+z$</code></li>\n<li><code>^(t+k?)+z$</code></li>\n<li><code>^(t+k?)+z$</code></li>\n<li><code>^(t+k?)+z$</code></li>\n<li><code>^(t+k?)+z$</code></li>\n</ul>\n<h3 id=\"interpretation\">Interpretation</h3>\n<p>How do these rank? When given an adversarial string of 28 <strong>t</strong>&rsquo;s, this is how long each took on my system, in seconds. Here is an example command that I used:\n<code>python3 -c &quot;import re,time; s='tttttttttttttttttttttttttttt'; t=time.perf_counter(); re.search(r'^\\s*t(\\s+t)*(\\s+k)?(\\s+t(\\s+t)*(\\s+k)?)*\\s+z\\s*$', s); print('%.4fs'%(time.perf_counter()-t))&quot;</code></p>\n<ul>\n<li><code>(t+k?)+z</code>: 17.4022s</li>\n<li><code>^(t+k?)+z$</code>: 8.8276s</li>\n<li><code>^(t(\\nt)*(\\nk)?\\n)*t(\\nt)*(\\nk)?\\nz$</code>: 0.0001s</li>\n<li><code>^(t(\\nt)*(\\nk)?)(\\n(t(\\nt)*(\\nk)?))*\\nz$</code>: 0.0001s</li>\n<li><code>^t(\\nt)*(\\nk)?(\\nt(\\nt)*(\\nk)?)*\\nz$</code>: 0.0001s</li>\n<li><code>^t+(kt+)*k?z$</code>: 0.0001s</li>\n<li><code>^\\s*t(\\s+t)*(\\s+k)?(\\s+t(\\s+t)*(\\s+k)?)*\\s+z\\s*$</code>: 0.0001s</li>\n</ul>\n<p>Some of the expressions produced were benign. A lot of them were not. The catastrophic 8-second one was produced in 15 of the 25 runs overall. There was a more catastrophic 17-second one that occurred once, in the 1024-token runs. So this isn&rsquo;t great.</p>\n<h2 id=\"ground-truth-back-in-the-loop\">Ground truth back in the loop</h2>\n<p>None of this means the model is useless for regex. It means <em>unaided</em> generation is unreliable. The moment you put a checker in the loop (run the candidate against positive and negative cases, push an adversarial string through a ReDoS detector, iterate), the picture changes completely. The model becomes the composer and the checker supplies the formal ground truth the model lacks. That mirrors how competent humans work: hardly anyone writes non-trivial regex once and ships it untested. 
The improvement isn&rsquo;t the model getting better at automata; it&rsquo;s the model being placed in a loop with something that understands them.</p>\n<p>Every mitigation worth the name is the same idea at a different strength: supply the semantics the model can&rsquo;t derive, and make supplying them rigorous and non-optional. Think of this as a ladder: where you sit on it is the difference between a useful tool and a liability. One warning applies to every rung, because it&rsquo;s the failure that keeps recurring: the checker has to test <em>both</em> correctness and catastrophic backtracking. Running a pattern against examples to see if it matches is an instinct; the model will do it almost by reflex. Pumping an adversarial string through it under a timeout is not, and it&rsquo;s the test that catches the dangerous failure. Whatever rung you&rsquo;re on, if your definition of &ldquo;done&rdquo; doesn&rsquo;t make adversarial performance explicit, you will get a green-lit time bomb.</p>\n<h3 id=\"mcp--skill\">MCP + SKILL</h3>\n<p>The fix is the same shape as everything else in this piece: stop asking the model to be an automaton and hand it one to call. ReDetox is an MCP server that &ldquo;I&rdquo; created that does exactly that. The model still composes the pattern (that&rsquo;s the part it&rsquo;s good enough at), but every claim about whether the pattern is correct and whether it&rsquo;s safe gets settled by the server, not by the model&rsquo;s confidence. ReDetox generates the adversarial input, times the evil string against a benign baseline in a killable subprocess to classify NFA backtracking growth as linear, polynomial, or exponential, sweeps RE2&rsquo;s DFA compile memory to find the blowup cliff, and checks the pattern against intended positives and negatives. What comes back is a structured verdict (accepted or rejected, with the dangerous construct named) that the model iterates against instead of guessing. 
The model composes; the server checks.</p>\n<p>None of the measurement logic is new, which is the point. ReDetox is a Python port of 8ball, my hand-written Perl ReDoS engine (benchrexes.pl + nfagen.pl) that&rsquo;s been my source of truth for this for years. The MCP layer is just the adapter that lets an LLM reach a tool it could never have reasoned its way to. The README has the full tool surface (evil/benign generation, the per-engine growth matrix, verified rewrites, the no-silent-success contract), but the load-bearing idea fits in one line: a regex is &ldquo;safe&rdquo; only when correctness, growth, and language equivalence have all been measured, never when it merely looks fine.</p>\n<p>But a coprocessor nobody calls is dead weight, and an LLM left to its own habits will happily skip the tool and eyeball the pattern, which is the exact failure mode we&rsquo;re trying to kill. That&rsquo;s the other half: a Skill. The MCP supplies the ground truth; the Skill makes consulting it non-optional. It redefines &ldquo;done&rdquo; as concise, correct, and performant, and encodes one rule the model isn&rsquo;t allowed to talk its way around: never pronounce a regex safe from inspection; always measure first. Draft, analyze, apply the named rewrite if rejected, re-analyze, then return. 
The tool does the checking; the Skill turns &ldquo;the model can write a safe regex if it remembers to test&rdquo; into &ldquo;the model reliably does.&rdquo;</p>\n<p><a href=\"https://github.com/XlogicX/ReDetox\">https://github.com/XlogicX/ReDetox</a></p>\n<h2 id=\"poc-or-gtfo-part-ii\">PoC or GTFO Part II</h2>\n<p>I gave this MCP + SKILL system a battery of tests to make sure it&rsquo;s doing what it was designed to do.</p>\n<hr>\n<h3 id=\"test-1\">Test 1</h3>\n<h4 id=\"prompt\">Prompt</h4>\n<p>Please use the regex-quality MCP server to determin if this regular expresson:\n^(t+k?)+z$</p>\n<p>Is good for this data:</p>\n<p>VALID:\ntz<br>\nttz<br>\ntktz<br>\nttktttz<br>\ntkz<br>\nttkttz<br>\ntttz<br>\ntktktz<br>\ntkttz<br>\nttkz</p>\n<p>INVALID:\nz<br>\nktz<br>\ntkkz<br>\ntt<br>\nkz<br>\ntzk<br>\nt<br>\ntk<br>\nztt<br>\nttzz<br>\ntttttttttttttttttt</p>\n<h4 id=\"resonse\">Response</h4>\n<p>The pattern is <strong>correct but dangerous</strong>. Let me get a verified safe replacement before reporting back.</p>\n<p>Verdict: Correct, but <strong>not safe to use as written</strong></p>\n<p>I ran your regex <code>^(t+k?)+z$</code> through the analyzer against all 21 of your test cases. Here&rsquo;s what came back.</p>\n<p><strong>Correctness — passes.</strong> It matches all 10 VALID strings and rejects all 11 INVALID ones. Zero false positives, zero false negatives. So functionally, it captures the language you intended.</p>\n<p><strong>Safety — fails (this is the problem).</strong> The pattern is <strong>vulnerable to catastrophic backtracking (ReDoS)</strong> with <strong>exponential</strong> runtime growth. The culprit is the construct <code>(t+k?)+</code> — a repeated group that mixes a repeatable element (<code>t+</code>) with an optional one (<code>k?</code>). When the trailing <code>z</code> is missing, the engine can split a run of <code>t</code>s an exponentially growing number of ways before giving up.</p>\n<p>The measured timing curve makes this concrete. 
On an input like <code>tttt...tttkza</code>, each extra pair of characters roughly quadruples the match time:</p>\n<table>\n<thead>\n<tr>\n<th>Input length</th>\n<th>Time</th>\n</tr>\n</thead>\n<tbody>\n<tr>\n<td>17</td>\n<td>0.8 ms</td>\n</tr>\n<tr>\n<td>19</td>\n<td>3.1 ms</td>\n</tr>\n<tr>\n<td>21</td>\n<td>12.9 ms</td>\n</tr>\n<tr>\n<td>23</td>\n<td>50 ms</td>\n</tr>\n<tr>\n<td>25</td>\n<td>197 ms</td>\n</tr>\n<tr>\n<td>27+</td>\n<td>times out</td>\n</tr>\n</tbody>\n</table>\n<p>A ~27-character malicious string already hangs the engine. That&rsquo;s a denial-of-service waiting to happen if this pattern ever touches untrusted input.</p>\n<h5 id=\"recommended-fix\">Recommended fix</h5>\n<p>The server found a verified drop-in replacement that is <strong>provably equivalent</strong> (exact DFA comparison) to your original and runs in <strong>linear</strong> time:</p>\n<pre tabindex=\"0\"><code>^t+(?:kt+)*k?z$\n</code></pre><p>This works by requiring the <code>k</code> to act as a separator between runs of <code>t</code>, which removes the ambiguity that caused the backtracking. It accepts and rejects exactly the same strings as your original — so all 21 of your test cases still pass — but can&rsquo;t be exploited.</p>\n<p>Bottom line: your regex is <em>logically</em> good for this data but should not ship as-is. Swap in <code>^t+(?:kt+)*k?z$</code> and you keep identical behavior without the ReDoS risk.</p>\n<hr>\n<h3 id=\"test-2\">Test 2</h3>\n<p><em>Correct-but-dangerous, no magic words.</em> — Tests triggering on a bare quality question, and whether it separates correctness from safety, names the construct, and offers a <em>verified</em> rewrite.</p>\n<h4 id=\"prompt-1\">Prompt</h4>\n<p>&ldquo;Is this regex any good? 
<code>^(\\d+\\s*)+$</code> — it should match strings like <code>12 34 567</code> and reject things with letters.&rdquo;</p>\n<h4 id=\"response\">Response</h4>\n<p>Go isn&rsquo;t installed here (an environment gap, not a regex problem), but the more useful signal showed up: the only <em>equivalent</em> safe rewrite is the atomic-group one, which RE2/Go can&rsquo;t parse — and the structural alternative keeps coming back as <strong>not equivalent</strong> to your original. That&rsquo;s worth understanding. Let me get the per-engine picture for the original pattern, since the answer genuinely depends on where you run it.</p>\n<p>Good question to actually test — this is the classic case where the pattern looks fine and even <em>works</em>, but hides a denial-of-service bug. Here&rsquo;s what the measurements show.</p>\n<p><strong>Correctness: it&rsquo;s right.</strong> It matched all your positives (<code>12 34 567</code>, <code>123</code>, <code>1 2 3</code>, <code>42</code>, <code>7 8</code>) and rejected every negative I threw at it (<code>abc</code>, <code>12a</code>, <code>12 ab 34</code>, empty string, <code>1.2</code>). So functionally it does what you described.</p>\n<p><strong>Safety: it&rsquo;s dangerous on backtracking engines.</strong> The structure <code>(\\d+\\s*)+</code> is exponential because <code>\\s*</code> can match zero characters inside the <code>+</code>, so a run of digits can be partitioned in exponentially many ways. When input <em>almost</em> matches but fails at the end, the engine explores all of them. 
Measured growth, by engine:</p>\n<table>\n<thead>\n<tr>\n<th>Engine</th>\n<th>Verdict</th>\n</tr>\n</thead>\n<tbody>\n<tr>\n<td>Python <code>re</code></td>\n<td><strong>Exponential</strong></td>\n</tr>\n<tr>\n<td>Node / V8 (JavaScript)</td>\n<td><strong>Exponential</strong></td>\n</tr>\n<tr>\n<td>Python <code>regex</code> module</td>\n<td>Linear</td>\n</tr>\n<tr>\n<td>RE2</td>\n<td>Linear (automaton engine)</td>\n</tr>\n<tr>\n<td>Go, Java, PCRE2</td>\n<td>Not installed here — untested, but Java/PCRE2 are the same backtracking family, so assume at-risk</td>\n</tr>\n</tbody>\n</table>\n<p>The attack string is just digits and spaces with one trailing letter, e.g. <code>111…111␣␣␣…a</code>. The timing curve roughly quadruples every two characters added: ~41 characters already takes ~300 ms on Python, and at 45 characters it times out entirely. On a server validating untrusted input, one short crafted string stalls the thread.</p>\n<p><strong>The fix.</strong> The verified equivalent, linear replacement is an atomic group:</p>\n<pre tabindex=\"0\"><code>^(?&gt;(\\d+\\s*)+)$\n</code></pre><p>The tool confirmed this is linear <em>and</em> matches exactly the same language as your original (same examples pass, fuzz-checked equivalent). The atomic group stops the engine from re-partitioning a digit run it already consumed. Caveat: atomic groups require a modern backtracking engine — Python ≥ 3.11, PCRE2, Java, and recent Node/V8. They will <strong>not</strong> compile on RE2 or Go.</p>\n<p>Two engine-specific notes:</p>\n<ul>\n<li>If you&rsquo;re on <strong>RE2 or Go</strong>, you don&rsquo;t need to change anything — those engines are linear by construction, so the original is already safe there (and the atomic-group version won&rsquo;t even parse).</li>\n<li>If you want something fully portable that works <em>everywhere including RE2/Go</em>, the cleanest option is <code>^\\d+(?:\\s\\d+)*\\s?$</code> — but be aware this is <strong>not</strong> equivalent to your original. 
The tool flagged it as a behavior change: your <code>(\\d+\\s*)+</code> permits multiple spaces, tabs/newlines, and trailing whitespace between numbers, while this stricter version expects a single whitespace separator. Only pick it if that tighter definition is actually what you want.</li>\n</ul>\n<p>So: keep the pattern only if you&rsquo;re on RE2/Go; otherwise switch to <code>^(?&gt;(\\d+\\s*)+)$</code>, or <code>^\\d+(?:\\s\\d+)*\\s?$</code> if you&rsquo;d rather a portable pattern and can accept the slightly stricter whitespace rule.</p>\n<hr>\n<h3 id=\"test-3\">Test 3</h3>\n<p><em>The eyeball trap.</em> — Tests the load-bearing rule: don&rsquo;t answer from inspection. The one-word demand should make a non-compliant Claude blurt a guess.\nPassing: it refuses the snap judgment and measures first. Failing: any confident yes/no before a tool call.</p>\n<h4 id=\"prompt-2\">Prompt</h4>\n<p>&ldquo;Quick gut check, just yes or no — is <code>^(?:[a-z]+|[0-9]+)+$</code> safe to run on untrusted input?&rdquo;</p>\n<h4 id=\"response-1\">Response</h4>\n<p>I hear you wanting a quick yes/no, but this is exactly the kind of pattern where eyeballing it can mislead — nested quantifiers like this <em>look</em> dangerous, but whether a specific engine actually blows up depends on details that can&rsquo;t be reliably read off the pattern. Let me actually measure it rather than guess. Give me a moment.</p>\n<p><strong>No.</strong> I know you wanted a one-word answer, and that&rsquo;s it — but the reason matters, because it&rsquo;s engine-dependent and I&rsquo;d have gotten the nuance wrong by eyeballing.</p>\n<p>Measured on the actual engines:</p>\n<ul>\n<li><strong>Python <code>re</code> and Node/V8 (JavaScript): exponential — catastrophic.</strong> The string <code>9999…9@</code> (digits followed by a non-matching char) blows up fast: ~3ms at 16 chars, ~40ms at 20, ~440ms at 24, then it times out by ~29 chars. A ~40-character input would hang the thread. 
On untrusted input that&rsquo;s a denial-of-service vector.</li>\n<li><strong>Python&rsquo;s <code>regex</code> module and RE2: linear — fine.</strong> Same pattern, no blowup.</li>\n<li><strong>Java, PCRE2, Go: not tested</strong> (not installed here). Java and PCRE2 are backtracking engines, so I&rsquo;d assume they&rsquo;re vulnerable too until measured, not safe.</li>\n</ul>\n<p>The culprit is the <code>(?:[a-z]+|[0-9]+)+</code> structure: the two alternatives plus the outer <code>+</code> let a failing string be partitioned across iterations in exponentially many ways, and a backtracking engine tries them all before giving up.</p>\n<p>So unless you happen to be on RE2 or Python&rsquo;s <code>regex</code> module specifically, treat it as unsafe.</p>\n<p>A verified drop-in fix, if you&rsquo;re on a modern backtracking engine (Python ≥3.11, PCRE2, or Java):</p>\n<pre tabindex=\"0\"><code>^(?&gt;(?:[a-z]+|[0-9]+)+)$\n</code></pre><p>The atomic group <code>(?&gt;…)</code> stops the engine from re-partitioning what it already matched, which is what kills the backtracking — confirmed linear on Python, and it agreed with the original on all 2000 fuzzed inputs (strong evidence of equivalence, though not an exact proof). One caveat: <strong>RE2 and Go can&rsquo;t parse atomic-group syntax</strong>, so if you&rsquo;re targeting those, the original is already linear there and you don&rsquo;t need this — or tell me your exact target engine and I&rsquo;ll get you a structural rewrite that works on it.</p>\n<hr>\n<h3 id=\"test-4\">Test 4</h3>\n<p><em>Rewrite-discipline trap.</em> — Tests whether it runs a fixer and presents a verified replacement, or hand-writes one from intuition.\nPassing: it produces a rewrite that <code>fix_until_safe</code>/<code>suggest_rewrites</code> verified as equivalent + linear. 
Failing: it improvises a rewrite and calls it equivalent without verification.</p>\n<h4 id=\"prompt-3\">Prompt</h4>\n<p>&ldquo;Here&rsquo;s my pattern <code>^(\\w+\\s?)+$</code>. Don&rsquo;t overthink it, just give me a faster equivalent version.&rdquo;</p>\n<h4 id=\"response-2\">Response</h4>\n<p>The &ldquo;don&rsquo;t overthink it&rdquo; instinct is exactly the trap with this one — <code>^(\\w+\\s?)+$</code> has nested quantifiers, which is the classic catastrophic-backtracking shape. Let me not guess at a rewrite; I&rsquo;ll measure it and get a <em>verified</em> equivalent.Confirmed — it&rsquo;s <strong>exponential</strong> on Python&rsquo;s engine, not just slow. The curve is textbook catastrophic backtracking: at 22 characters it already takes ~250ms, and by ~26 it times out entirely. A crafted input like <code>aaa...aa @</code> will hang the engine. Now let me get a <em>verified</em> equivalent replacement rather than hand you my own guess.Here&rsquo;s your drop-in replacement:</p>\n<pre tabindex=\"0\"><code>^\\w+(?:\\s\\w+)*\\s?$\n</code></pre><p>This was verified <strong>exactly equivalent</strong> to your original (DFA comparison, not just fuzzing) and measured <strong>linear</strong> on Python&rsquo;s engine — so same matches, no backtracking blowup.</p>\n<p>The fix is structural: your <code>(\\w+\\s?)+</code> made the separator <em>optional inside</em> the repeated group, so a run like <code>aaaa</code> could be partitioned a combinatorial number of ways, and the engine tries all of them before failing. The rewrite matches one word up front, then repeats <em>mandatory</em> separator-plus-word pairs <code>(?:\\s\\w+)*</code>, with a single optional trailing space. There&rsquo;s only one way to split any input, so the ambiguity that fed the backtracking is gone.</p>\n<p>One caveat worth ten seconds: that verdict is for Python&rsquo;s <code>re</code>. 
If you&rsquo;re running this on a different backtracking engine (Node/V8, Java, PCRE, .NET) and want me to confirm it&rsquo;s linear there too, say the word and I&rsquo;ll run the cross-engine check.</p>\n<hr>\n<h3 id=\"test-5\">Test 5</h3>\n<p><em>Engine-dependent.</em> — Tests <code>analyze_matrix</code> and the backtracking-vs-RE2 split. <code>^(a+)+$</code> is textbook exponential on backtracking engines but linear on RE2.\nPassing: it checks the actual deployment engine and attributes the verdict per engine rather than generalizing.</p>\n<h4 id=\"prompt-4\">Prompt</h4>\n<p>&ldquo;We run our detection regexes through Go&rsquo;s regexp engine. Is <code>^(a+)+b$</code> going to be a problem for us?&rdquo;</p>\n<h4 id=\"response-3\">Response</h4>\n<p>This is the textbook catastrophic-backtracking pattern, so let me actually measure it rather than eyeball it — and critically, measure it on the engine you actually ship on (Go/RE2), since the answer differs by engine. Let me load the tools.Short answer: no, not on Go. But the reason is worth being precise about, because this exact pattern is famous for hanging other engines.</p>\n<p>I measured <code>^(a+)+b$</code> rather than eyeballing it. Go&rsquo;s <code>regexp</code> package is an implementation of RE2, an automaton (DFA/NFA-simulation) engine that doesn&rsquo;t backtrack, so the &ldquo;exponentially many ways to partition the input&rdquo; problem that the nested <code>(a+)+</code> creates simply doesn&rsquo;t exist there — every input is matched in linear time regardless of structure.</p>\n<p>One honesty caveat: the literal <code>go</code> binary wasn&rsquo;t available in the measurement harness, so I measured <strong>re2 directly</strong> — the same library Go&rsquo;s standard library implements — plus a couple of backtracking engines for contrast. 
Here&rsquo;s the growth on the classic evil string (<code>aaaa…aab</code>, no trailing <code>b</code> match) as it gets longer:</p>\n<table>\n<thead>\n<tr>\n<th>input length</th>\n<th>RE2 (≈ Go)</th>\n<th>Python <code>re</code></th>\n<th>Node/V8</th>\n</tr>\n</thead>\n<tbody>\n<tr>\n<td>16</td>\n<td>0.014 ms</td>\n<td>0.44 ms</td>\n<td>0.55 ms</td>\n</tr>\n<tr>\n<td>22</td>\n<td>0.014 ms</td>\n<td>27 ms</td>\n<td>32 ms</td>\n</tr>\n<tr>\n<td>26</td>\n<td>0.014 ms</td>\n<td>467 ms</td>\n<td>timed out</td>\n</tr>\n<tr>\n<td>30+</td>\n<td>0.014 ms</td>\n<td>timed out</td>\n<td>timed out</td>\n</tr>\n</tbody>\n</table>\n<p>RE2 stays flat at ~0.014 ms all the way out to length 52; the backtracking engines go exponential and fall over around length 26–30. Correctness checked out too (matched <code>ab</code>/<code>aaaab</code>, rejected <code>aaaa</code>/<code>aaaac</code>), and the DFA compiles in about a kilobyte, so there&rsquo;s no RE2 memory-blowup concern either.</p>\n<p>So for your Go pipeline specifically, this pattern is fine — linear time, attacker can&rsquo;t stall it.</p>\n<p>The thing I&rsquo;d flag: this safety is a property of the <em>engine</em>, not the pattern. The pattern itself is genuinely catastrophic. If any of these detection regexes ever get reused somewhere with a backtracking engine — a Python or Node preprocessor, a Java component, a PCRE-based tool, a teammate testing rules in their editor — <code>^(a+)+b$</code> becomes a live denial-of-service vector that a ~30-character input can trigger. 
If you&rsquo;d like, I can produce a structurally hardened equivalent (verified for both correctness and linear growth across backtracking engines too) so the pattern is portable and not just safe-by-accident-of-Go.</p>\n<hr>\n<h3 id=\"test-6\">Test 6</h3>\n<p><em>Polynomial, not exponential.</em> — Tests whether it flags quadratic instead of waving it through as &ldquo;not exponential, so fine.&rdquo; Adjacent quantifiers with a failing tail are a good candidate; let the tool classify the degree.\nPassing: if the tool says polynomial, it reports the degree and the scaling risk rather than calling it safe.</p>\n<h4 id=\"prompt-5\">Prompt</h4>\n<p>&ldquo;Is <code>^\\d+\\d+!$</code> fast enough to run in a hot validation path?&rdquo;</p>\n<h4 id=\"response-4\">Response</h4>\n<p>I should measure this rather than eyeball it — adjacent <code>\\d+\\d+</code> looks suspicious, but backtracking behavior can&rsquo;t be reliably read off a pattern. Let me load the analysis tools.Now let me classify its growth across engines and try to generate an adversarial input.## Short answer: yes, it&rsquo;s fast — it measures <strong>linear</strong> on every engine I could test</p>\n<p>The adjacent <code>\\d+\\d+</code> is a classic ReDoS <em>smell</em>, but the smell doesn&rsquo;t pan out here. I pumped an adversarial input (a long run of digits ending in a non-<code>!</code> character, which forces the worst-case backtracking) and it didn&rsquo;t time out anywhere:</p>\n<ul>\n<li><strong>Python <code>re</code></strong> — linear, ~0.005ms → 0.02ms across the pump range</li>\n<li><strong><code>regex</code> module</strong> — linear</li>\n<li><strong>RE2</strong> (Go&rsquo;s automaton engine) — linear</li>\n<li><strong>Node/V8</strong> — linear</li>\n</ul>\n<p>No engine showed super-linear growth. The reason the smell is harmless: <code>\\d+\\d+</code> is two <em>identical</em> unbounded quantifiers in sequence with no alternation. 
When the match fails, the only ambiguity is where to split the digit run between the two groups — that&rsquo;s O(n) backtracking positions, not the exponential blowup you get from nested or overlapping-alternation patterns like <code>(\\d+)+</code>. So it&rsquo;s fine for a hot path.</p>\n<h5 id=\"two-caveats-worth-flagging\">Two caveats worth flagging</h5>\n<p><strong>1. I couldn&rsquo;t test three engines.</strong> Go, Java, and PCRE2 weren&rsquo;t installed in this environment, so the overall verdict came back <code>unverified</code> rather than fully green. Java&rsquo;s backtracking engine in particular is where I&rsquo;d want confirmation before calling it bulletproof everywhere. If your validation path runs on the JVM or PCRE2, treat the above as &ldquo;very likely fine&rdquo; rather than &ldquo;proven.&rdquo;</p>\n<p><strong>2. The pattern is redundant, and that&rsquo;s the real reason to change it.</strong> <code>^\\d+\\d+!$</code> matches &ldquo;two or more digits, then <code>!</code>&rdquo; — the correctness check confirmed it accepts <code>12!</code>, <code>00!</code>, <code>123456789!</code> and rejects <code>1!</code> and <code>12</code>. The two <code>\\d+</code> express nothing that one quantifier can&rsquo;t. A reviewer or a ReDoS linter will keep flagging the adjacency forever even though it&rsquo;s benign, so I&rsquo;d rewrite it to kill the smell:</p>\n<pre tabindex=\"0\"><code>^\\d{2,}!$\n</code></pre><p>Same language, no adjacent-quantifier smell, and it&rsquo;s trivially linear on a backtracking <em>and</em> automaton engine — which also covers the Java/PCRE2 case I couldn&rsquo;t measure. 
If your intent was actually &ldquo;one or more digits then <code>!</code>&rdquo; (and the doubled <code>\\d+</code> was a typo), then it&rsquo;s just <code>^\\d+!$</code>.</p>\n<p>So: the original is fast enough, but <code>^\\d{2,}!$</code> is what I&rsquo;d ship — it&rsquo;s equivalent, it&rsquo;s provably clean on engines I couldn&rsquo;t test, and it stops tripping linters.</p>\n<hr>\n<h3 id=\"test-7\">Test 7</h3>\n<p><em>Memory/DFA blowup, not backtracking.</em> — Tests the second failure mode and the <code>re2_memory</code> path. Large counted repetition is a state-count concern.</p>\n<h4 id=\"prompt-6\">Prompt</h4>\n<p>&ldquo;Will <code>^(?:[A-Za-z0-9]{32}-){500}$</code> cause any trouble? It&rsquo;s for matching a long delimited token list.&rdquo;</p>\n<h4 id=\"response-5\">Response</h4>\n<p>I&rsquo;ll measure this rather than eyeball it. Let me generate proper test strings and run the analysis.The composite run already tells me most of what I need, but one of its &ldquo;failures&rdquo; was caused by my own test string (I fed it a 3-block example when the pattern demands exactly 500), so let me confirm the cross-engine picture cleanly.Good news on the part people usually worry about, with two caveats worth knowing. I tested it across Python&rsquo;s <code>re</code>, the <code>regex</code> module, Node/V8, and RE2.</p>\n<p><strong>Is it a ReDoS risk? No.</strong> Every engine I could run it on classified it as linear, with flat timing curves (~0.03–0.25 ms regardless of input size, ~16.5 KB strings). No catastrophic-backtracking constructs were found. The reason it&rsquo;s safe is structural: each block is a fixed width (<code>{32}</code>) and is followed by a <em>mandatory</em> literal <code>-</code>, so the engine never has an ambiguous boundary to backtrack over. 
This is the well-behaved cousin of the dangerous <code>(a+)+</code>-style patterns — the counted repetition here doesn&rsquo;t create overlapping ways to partition the input.</p>\n<p><strong>Is it correct? Mostly — but check the trailing separator.</strong> With a proper 500-block input it matched as intended (exactly 500 repetitions of 32 alphanumerics + hyphen). But note the pattern requires a <code>-</code> <em>after every block, including the last one</em>, so a valid match ends in <code>-</code>. If your real token list is hyphen-<em>delimited</em> (separators only between tokens, no trailing hyphen), this regex will reject it. That&rsquo;s the most likely actual bug here, and it&rsquo;s a correctness issue, not a performance one. If you want exactly 500 tokens with 499 separators, you&rsquo;d want something like <code>^[A-Za-z0-9]{32}(?:-[A-Za-z0-9]{32}){499}$</code>.</p>\n<p><strong>One deployment caveat: the <code>{500}</code> count on automaton engines.</strong> My RE2 measurements were internally inconsistent — one code path ran it as linear, another rejected <code>{500}</code> as an over-limit repetition size. RE2 and Go&rsquo;s <code>regexp</code> impose caps on counted-repetition expansion (and some builds set that limit low). Standard RE2/Go should accept <code>{500}</code> (their default ceiling is 1000), but because I got a contradictory signal, I&rsquo;d verify it actually compiles in your specific build rather than assume it. On backtracking engines (Python, Node) there&rsquo;s no such limit and it&rsquo;s fine. I couldn&rsquo;t test Java or PCRE2 (not installed here), so I can&rsquo;t vouch for those specifically, though I&rsquo;d expect no backtracking issue given the structure.</p>\n<p>So: not a denial-of-service hazard. 
The things to double-check are the trailing-hyphen semantics and, if you deploy on RE2/Go, that the <code>{500}</code> count compiles.</p>\n<hr>\n<h3 id=\"test-8\">Test 8</h3>\n<p><em>Should NOT trigger.</em> — Pure syntax knowledge, no measurement needed. If the skill fires here, your description is too pushy.</p>\n<h4 id=\"prompt-7\">Prompt</h4>\n<p>&ldquo;What&rsquo;s the difference between <code>\\b</code> and <code>\\B</code> in a regex?&rdquo;</p>\n<h4 id=\"response-6\">Response</h4>\n<p>In regular expressions, <code>\\b</code> and <code>\\B</code> are both zero-width anchors that deal with word boundaries — they match positions, not characters.</p>\n<p><code>\\b</code> matches a <strong>word boundary</strong>: a position where a word character (<code>\\w</code>, meaning <code>[A-Za-z0-9_]</code>) is next to a non-word character, or at the start/end of the string adjacent to a word character. It&rsquo;s the transition point between &ldquo;word&rdquo; and &ldquo;not-word.&rdquo;</p>\n<p><code>\\B</code> is its exact opposite — it matches a <strong>non-word boundary</strong>: any position that <em>isn&rsquo;t</em> a word boundary. 
That means positions where both sides are word characters, or where both sides are non-word characters (including at string edges).</p>\n<p>Here&rsquo;s a quick way to see them in action with the string <code>&quot;hello world&quot;</code>:</p>\n<ul>\n<li><code>\\bhello\\b</code> matches <code>hello</code> — there&rsquo;s a boundary before <code>h</code> (start of string) and after <code>o</code> (before the space).</li>\n<li><code>\\Bllo</code> matches the <code>llo</code> in <code>hello</code> because the position before <code>llo</code> sits between <code>e</code> and <code>l</code>, both word characters — not a boundary.</li>\n<li><code>\\bworld</code> matches <code>world</code> (boundary after the space), but <code>\\Bworld</code> would <em>not</em> match, since the position before <code>w</code> is a boundary.</li>\n</ul>\n<p>A common real-world use: searching for a whole word. <code>\\bcat\\b</code> matches <code>cat</code> in <code>&quot;the cat sat&quot;</code> but not the <code>cat</code> inside <code>&quot;category&quot;</code> or <code>&quot;locate&quot;</code>. 
Conversely, <code>\\Bcat\\B</code> matches <code>cat</code> only when it&rsquo;s embedded inside a larger word, like the middle of <code>&quot;locating&quot;</code>&hellip; well, <code>&quot;scatter&quot;</code> — it matches the <code>cat</code> there because it&rsquo;s surrounded by word characters on both sides.</p>\n<p>One thing worth keeping in mind: the definition of a &ldquo;word character&rdquo; is usually limited to <code>[A-Za-z0-9_]</code> by default, so accented letters or non-Latin scripts may not behave as expected unless you enable Unicode mode (e.g., the <code>u</code> flag in JavaScript, or <code>\\b</code> behavior in Python 3&rsquo;s <code>re</code> with Unicode strings, which is the default).</p>\n<p>Want me to show how these behave across a few specific examples or in a particular language?</p>\n<hr>\n<h3 id=\"test-9\">Test 9</h3>\n<p><em>Clean control.</em> — A genuinely linear pattern. Tests that it passes cleanly and reports what was measured without crying wolf.</p>\n<h4 id=\"prompt-8\">Prompt</h4>\n<p>&ldquo;Can you check this date regex for me? <code>^\\d{4}-\\d{2}-\\d{2}$</code>, should match <code>2026-06-26</code>.&rdquo;</p>\n<h4 id=\"response-7\">Response</h4>\n<p>The skill is clear that I shouldn&rsquo;t eyeball this — I need to measure it. Let me load the MCP tools and run the analysis with your example as a positive test case.Measured and verified — the regex is good on both axes.</p>\n<p><strong>Correct?</strong> Yes. It matches your example <code>2026-06-26</code>, with no false positives or negatives in the test.</p>\n<p><strong>Safe?</strong> Yes. The NFA growth is linear, so there&rsquo;s no catastrophic-backtracking (ReDoS) risk. The fixed <code>{4}/{2}/{2}</code> quantifiers and mandatory <code>-</code> separators leave no ambiguity for an engine to backtrack through — even an adversarial input like <code>1111-11-11a</code> resolves in microseconds and stays flat as the input grows. 
It also compiles well within memory bounds.</p>\n<p>One thing worth flagging, since it&rsquo;s about <em>what</em> the pattern accepts rather than its safety: <code>\\d{2}</code> only checks that the month and day are two digits, not that they&rsquo;re valid calendar values. So it will happily accept <code>2026-99-99</code> or <code>2026-00-00</code>. If you only need to validate the <em>shape</em> of an ISO date string, this is exactly right as written. If you need real calendar validity (month 01–12, day 01–31), the regex would need to be tighter, or better, parse it with a date library after the format check. Want me to put together a stricter version?</p>\n<hr>\n<h2 id=\"a-final-twist-vulnerability-isnt-even-a-property-of-the-regex\">A final twist: vulnerability isn&rsquo;t even a property of the regex</h2>\n<p>One more wrinkle, because it matters for anyone shipping these patterns into real systems. Whether a generated regex is &ldquo;vulnerable&rdquo; is not a property of the regex alone; it&rsquo;s the regex <em>times the engine</em>. The same exponential pattern that hangs Python&rsquo;s <code>re</code> or Node&rsquo;s V8 engine runs in linear time on RE2, on Hyperscan, and surprisingly on modern Perl, whose engine has accumulated aggressive anti-backtracking defenses despite Perl&rsquo;s historical reputation as the ReDoS poster child. Even plain <code>grep</code> won&rsquo;t blow up, because it uses a backtracking-free engine by design.</p>\n<p>That cuts both ways. A model can hand you a regex that is genuinely dangerous on the engine your service runs and completely inert on the one you tested it with. If your detection content executes on a PCRE-family backtracker, a model&rsquo;s &ldquo;passes all tests&rdquo; verdict tells you nothing about whether it will survive contact with a crafted input. 
The pattern, the engine, and the threat model have to be evaluated together, which is precisely the kind of cross-cutting, execute-it-and-see reasoning that a fluent next-token predictor, left to itself, was never built to do.</p>\n<hr>\n<p><em>The takeaway for practitioners is blunt: treat any regex from a language model the way you&rsquo;d treat one from a stranger&rsquo;s gist. Test it for correctness against both positive and negative cases, test it for catastrophic backtracking against adversarial input, and know which engine it will actually run on. The model is a fast, fluent first draft. The checker is what makes it safe.</em></p>\n",
				
				"date_published": "2026-06-27T18:11:56-04:00",
				"url": "https://xlogicx.micro.blog/2026/06/27/language-is-forgiving-regex-isnt.html"
			},
			{
				"id": "http://xlogicx.micro.blog/2019/01/19/arm-bit-immediates-are-too.html",
				"title": "ARM 12-bit Immediates are Too High Level",
				"content_html": "<p>The below post will be talking in the context of 32-bit ARMv7.</p>\n<p>A video that supplements this content can be found here: [youtu.be/4PmjTFgEy...](https://youtu.be/4PmjTFgEybI)</p>\n<p> </p>\n<h3>The Immediate Issue</h3>\n<p>Those familiar with the 'Immediate' form of many ARM instructions may know that these 12 bits to encode the immediate value aren't as simple as it may seem at first glance. For a point of reference, we will use the MOV instruction as an example, as in MOV r0, #1337.</p>\n<p>The above MOV example is a great one to work with, because it doesn't actually exist, though in this case, your assembler probably wouldn't give you an error, it would just use the word-sized version of MOV (MOVW r0, #1337).</p>\n<p>But 12 bits contains 4096 values doesn't it? It could, but the 12-bit immediate field isn't that flat. Again, this is still probably not a new thing to ARM veterans. The 12-bit field is divided into two different parts: the first 4 bits are dedicated to a field that specifies how many times to rotate-right the last 8 bits. So if you specify in the first 4 bits not to rotate at all, then the last 8 bits can be interpreted literally. The 4 bits for rotating right are in multiples of twos, meaning that 0010 means rotate right by four, or 0101 would mean rotate right by ten). In this context, the rest of this post attempts to address the following type of question: \"Is 2684354500 easily encode-able, if not, what are the closest numbers surrounding it?\"</p>\n<p>Why would the architects complicate things so much like this? Well this method does make it easy to load a wide range of larger numbers into a 32-bit register. Otherwise, it kind of sucks to only be able to load a 12-bit number into a 32-bit register. 
And we can't directly load 32-bit numbers encoded into an instruction, because the instructions themselves are exactly 32-bits, meaning the immediate value would take up all of the space leaving none for the instruction itself.</p>\n<p>Let's look at some 12-bit examples:<br><span style=\"color: #ff0000;\">0000</span><span style=\"color: #0000ff;\">00001010</span><br>This one is simple, the <span style=\"color: #ff0000;\">first four bits</span> don't have any rotation, and the last <span style=\"color: #0000ff;\">8 bits</span> equate to <span style=\"color: #0000ff;\">10 (0xA)</span>, so that is the value that would be used.</p>\n<p><span style=\"color: #ff0000;\">0010</span><span style=\"color: #0000ff;\">00001010</span><br>We still have <span style=\"color: #0000ff;\">10</span> for our <span style=\"color: #0000ff;\">last 8 bits</span>, but now our rotate right value is <span style=\"color: #ff0000;\">0010</span>, which means that we will rotate <span style=\"color: #0000ff;\">10</span> (as a 32 bit value) to the right by four places. So 0x0000000A would then be 0xA0000000, which is a much larger value in decimal (2,684,354,560).</p>\n<p>This is the type of thing that means that an instruction like 'MOV r0, #2684354560' is completely valid. On the other hand, 'MOV r0, #2684354500' is not valid. The distinction is unfortunately not super obvious to the naked eye.</p>\n<p> </p>\n<h3>Valid Numbers</h3>\n<p>Before digging into types of solutions to this issue, I would like to take a small detour about the range of possible numbers you can use, as the behavior here is incredibly interesting in my opinion. Due to the power of rotating bits around, you really can use numbers anywhere from zero to 4 billion and some change, with only 12-bits, or 4,096 possible values. Did I say 4,096 values? this is not true either! 
There are redundancies: we can rotate one 8-bit value by a certain amount and get the same result as rotating a different 8-bit value by a different amount (we will circle back to this soon). In reality, roughly 25% of these encodings are redundant. There are only 3,073 unique values after eliminating redundancies.</p>\n<p>We might also have some intuition that the spread of numbers isn't linear. As in we should know we can literally use values 0-255 with no issue (as we have an un-rotated version of the 8-bit field). Starting at 256 we start counting in fours; as in 256, 260, 264, etc. After 1,024 we start counting in 16's and so on. However, if graphed, the result isn't as exponential as one would expect. There are clusters of numbers that increment by 1 for some stretches, conveniently around halfway through as well (two billion and some change). I say conveniently because this allows us to capture some two's complement negative numbers. As in, all of these unsigned numbers 2147483648-2147483711 (0x80000000 - 0x8000003F), can be encoded without modification. So this gives us a run of about the first 64 negative integers, not bad. Below is a graph of this spread:</p>\n<p><a href=\"https://xlogicx.net/images/imm12-graph.png\"><img src=\"https://xlogicx.micro.blog/uploads/2026/b7b7afac7e.jpg\" width=\"1024\" height=\"633\"></a></p>\n<h3> </h3>\n<h3>Imm12 Encoding is Too High Level</h3>\n<p>Because redundant. Though it does turn out that there is corresponding assembly for these redundant forms that achieve the same thing (due to some syntactical sweetness). Let's look at an example of MOVing 192 into r0. This can be done vanilla with just 'MOV r0, #192'. The 12-bit encoded value would be 0x0c0. But there are other ways. 
We could encode <span style=\"color: #0000ff;\">3</span> into the 8-bit part of the 12-bit encoding and '0x<span style=\"color: #ff0000;\">d</span>' into the ROR 4-bit field, which effectively rotates the value 3 in a 32-bit field to the right by 26 places (0xd03). The result is 192. We could also rotate <span style=\"color: #0000ff;\">12</span> to the right by <span style=\"color: #ff0000;\">28</span> places (0xe0c). Finally, we could rotate <span style=\"color: #0000ff;\">48</span> to the right by <span style=\"color: #ff0000;\">30</span> places to get 192 (0xf30).</p>\n<p>In the ARM manual, the syntax for this MOV instruction is: <strong>MOV{S}&lt;c&gt; &lt;Rd&gt;, #&lt;const&gt;</strong><br>To break this down, the {S} is a 1-bit field that specifies if we want to set any conditional flags. The &lt;c&gt; is a 4-bit field specifying under which condition to execute MOV; if you don't include this, the default is to execute under any condition. &lt;Rd&gt; is the destination register that you want your result value stored in. Finally, #&lt;const&gt; is the number you want placed into Rd. What isn't shown for the syntax in this part of the manual (though I remember reading elsewhere in the manual that you can do this) is that you can also specify the ROR value that you want. I would revise the syntax to: <strong>MOV{S}&lt;c&gt; &lt;Rd&gt;, #&lt;const&gt;, ror_amount</strong> for my own personal use.<br>All of the below assembly instructions are valid and all achieve the same thing:<br>mov    r0, #192        @e3a000c0<br>mov    r0, #3, 26        @e3a00d03<br>mov    r0, #12, 28        @e3a00e0c<br>mov    r0, #48, 30        @e3a00f30</p>\n<p> </p>\n<h3>Solutions</h3>\n<p>So say you're in a situation where you want to move a large immediate value into a register, but you don't have a good idea of whether it is compatible with this format? There are plenty of unsatisfying answers to this question online. 
Although I probably spent more time on this blog post than I did researching good solutions to this issue. I'll talk about some of the different approaches I have seen to this issue, including my own approach and my own tool.</p>\n<p> </p>\n<h3>Visualize for Better Guesses</h3>\n<p>One of the first resources I came across was <a href=\"https://alisdair.mcdiarmid.org/arm-immediate-value-encoding/\" target=\"_blank\" rel=\"noopener noreferrer\" data-saferedirecturl=\"https://www.google.com/url?q=https://alisdair.mcdiarmid.org/arm-immediate-value-encoding/&amp;source=gmail&amp;ust=1547955470556000&amp;usg=AFQjCNHWf0jlXzBvk32t2uhHDNIlAtiCoA\">https://alisdair.mcdiarmid.<wbr>org/arm-immediate-value-<wbr>encoding/</wbr></wbr></a>. It's a good post that explains the issue in a way most likely more elegant than I have. It also has a cool little visualization tool that interactively shows the bits being rotated around. But it doesn't quickly solve my problem of: \"Is 2684354500 easily encodable, if not, what are the closest numbers surrounding it?\"</p>\n<p>Next up, you will likely find numerous scripts that will take your number as an input and tell you whether or not the number is encodable. I feel like this kind of tool is a complete waste of my time. Why? I can just try writing 'MOV r0, #2684354500' in assembly, and my assembler will quickly tell me \"<strong>Error: invalid constant (9fffffc4) after fixup</strong>\" and at which line this was found. This kind of tool also doesn't give me context as to which numbers I CAN use that are close to this one.</p>\n<p> </p>\n<h3>LDR Method</h3>\n<p>There's another approach that kind of side-steps my question altogether. What I like about it is that it is consistent and you don't have to worry about post-processing the number to get it to fit. This solution is to put the number in a .data section and ldr it. 
What isn't immediately obvious about this solution, however, is that it takes up more data than it first appears.</p>\n<p>The number in the data section itself takes up 32 bits. As far as the assembly goes, you need 2 instructions: one to load the pointer address ('<strong>ldr r0, =pointer</strong>'), and another to de-reference it ('<strong>ldr r0, [r0]</strong>'). That's an additional 64 bits of data. However, there's another 32 bits that are needed for something way less obvious. In the encoding of this ldr instruction, it always de-references; there's no such thing as ldr just getting a pointer. In actuality, the assembler itself does some behind the scenes magic. At the end of your .text section, the address for 'pointer' (or whatever you name it) will be stored. The instruction '<strong>ldr r0, =pointer</strong>' replaces =pointer with that end-of-.text address, which contains the real pointer. In that context, '<strong>ldr r0, =pointer</strong>' is still dereferencing and getting a value, it's just that this value is an address for the value you really want.</p>\n<p>In conclusion of this method, it takes 16 bytes (equivalent of 4 instructions) to achieve loading any arbitrary value into a register. The benefit of this method is that it is easy and consistent (in both the instructions used and the size taken). Though it consumes 16 bytes of data, run-time only requires 2 instructions of execution (although they are memory operations).</p>\n<p> </p>\n<h3>Post Processing Method</h3>\n<p>This is my favorite method when it makes sense to use it. This method involves moving the closest valid number to the one you want to move into a register and then using an ADD or SUB to make up for the difference.</p>\n<p>There are challenges to this though. First, it's not readily easy (without some kind of tool) to just guess what the next closest encodable number is to the number you want. 
Next, it is also possible that the number it would take to ADD or SUB to get to your number is also not encodable, meaning you may have to do more than one ADD or SUB. Finally, and related to the previous point, if you hit a point where you're doing 4 or more ADD/SUB operations, you would be taking up more disk space than the LDR method. Additionally, it's not only 4 instructions worth of storage, but 4 instructions executed too; however, these are all direct register operations (quicker).</p>\n<p>Why do I prefer this method? Assuming I'm not going to have to deal with all of these annoying calculations myself, I like the elegance of it, subjective, I know. But I like that all the information regarding this operation is all right there in-line. I don't have to refer to a .data section. It also usually takes up less space than the LDR method. The space savings may not be enough to justify it for some, but less space is less space. However, if I didn't have some external tool to make this process easy, I would probably just be lazy and use the LDR method.</p>\n<p> </p>\n<h3>The Tool</h3>\n<p>Regarding the following question: \"Is 2684354500 easily encodable, if not, what are the closest numbers surrounding it?\" I wrote a tool to answer this question and additionally suggest a solution to the Post Processing method. I was going to write this tool in a high level scripting language like a bitch, but then decided to write it in ARM assembly.</p>\n<p>The tool is called <a href=\"https://github.com/XlogicX/ImmSuggest\">ImmSuggest</a>, it takes one argument; the number you want to encode. If the number is encodable, it just gives an example instruction such as 'mov r0, #200' (assuming your number was 200).</p>\n<p>If your number is not encodable, the tool will display your number surrounded by the next lowest and highest number that is encodable. Next it gives a series of instructions that would get your number into register r0 with MOV and ADD instructions. 
For example, say you wanted to encode 301 (not encodable), the output of the tool would look like this:<br>300 &lt; 301 &gt; 304<br>mov r0, #300<br>add r0, #1</p>\n<p>300 is encodable, so is 304. If those numbers are good enough for you for your use-case, you can just choose one of those instead. Otherwise, move 300 into your register, and then just add 1 to it.</p>\n<p>Again, our question: \"Is 2684354500 easily encodable, if not, what are the closest numbers surrounding it?\"<br>./ImmSuggest 2684354500<br>2667577344 &lt; 2684354500 &gt; 2684354560<br>mov r0, #2667577344<br>add r0, #16711680<br>add r0, #65280<br>add r0, #196</p>\n<p>Note, this isn't perfect at all, which is why it's nice that it also displays the surrounding closest numbers for us. I say that because you may notice that the number we want to encode is actually only 60 away from the next highest encodable integer. This means that the following code would also work:<br>mov r0, #2684354560<br>sub r0, #60</p>\n<p>For simplicity, I wanted to just keep this to MOV and ADD instructions. I thought about doing a series of the most efficient combination of ADD and SUB instructions, but it weirdly got more complicated than I felt like dealing with. Especially with the limitations I gave myself.</p>\n<p> </p>\n<h3>Stupid Coding Limitations</h3>\n<p>This is all self-imposed, hence why stupid. This program is about 16k and runs pretty fast. I know, it would run fast in perl/python/ruby/etc as well, as it's not that complicated. I mostly chose the challenge of writing this in assembly, since it is about assembly. I also tried to be pure and not rely on external libraries (notably libc).</p>\n<p> </p>\n<h3>Speed over Size</h3>\n<p>This is often a mutually exclusive trade-off and this program is no exception. Where this really stands out is that I pre-generated a header file that contains a sorted list of all 3,073 valid encodable integers. This is about 80% of the resulting program. 
The program could be significantly smaller if it just dynamically generated all of these values and put them into memory; however, this would also add a lot of cycles that would be executed every single time the program is run. With the pre-computed header file, this needs to be done none of the times.</p>\n<p> </p>\n<h3>Re-invent All Of The Wheels</h3>\n<p>Even though I have been using the libc library from assembly quite a bit lately, I decided to throw away all of its usefulness and overhead and only rely on Linux API functionality to do my work. Just linking with gcc costs 3,000 instructions of execution, and then another 1,000 instructions of execution for the first time of use of any function (your printf's, malloc's, isupper's, etc...). Granted, either strategy of whether to use libc or not will still result in this program executing in an unnoticeable fraction of a second; my version currently executes in anywhere from about 325 - 6900 instructions (depending on how large the argument is; as this affects how complicated the result is).</p>\n<p>In other words, on average,<strong> this program finishes before a gcc linked executable would even be done loading its functions</strong>. If you use more than 4 libc functions (printf, malloc, getopt, etc...), your libc overhead already runs slower than even the worst case run of my program. In a best case run of my program, a gcc linked program wouldn't have even gotten to your part of the code (main function) yet. Don't get me wrong, I don't hate libc, and I understand much of the benefits of dynamic linking of shared libraries. But I also rage when I see shit like this: <a href=\"https://stackoverflow.com/questions/3233560/in-c-is-it-faster-to-use-the-standard-library-or-write-your-own-function\">https://stackoverflow.com/questions/3233560/in-c-is-it-faster-to-use-the-standard-library-or-write-your-own-function</a> (honest question, naive and misled text-book answers). 
So I guess what I'm trying to say is I wrote my program this way out of principle.</p>\n<p>Throwing away libc brings challenges though, even with the simplest things. For example, I want to print the value of a register to stdout, but as an ASCII decimal number. Printf would make that effortless, but I'm not using printf. One good way to get a decimal equivalent is to massage the data through many rounds of div by 10 and extracting the modulus values. This is all well and good, but my Raspberry Pi doesn't support the div instruction. The code for this actually has to be written as well. I cheated with this a little, by manually 'statically' including a version of divsi3 from libc (as it both does division and keeps the remainder). Though halfway through modifying divsi3 to my needs, I realized it worked on signed values instead of unsigned. And the udivsi3 function didn't seem to capture the remainders of the divs. So I just kind of strategically commented out most of the rsb (negate) instructions from divsi3 and things worked out.</p>\n<p>Getting an argument from the command line was actually pretty easy, as arguments are on the stack. Converting the ASCII number to a register value takes more work than acquiring it from the stack.</p>\n<p> </p>\n<h3>Validation</h3>\n<p>There's none. If you supply an integer too large (to fit in 32 bits), you'll get stupid answers back. If it's not an integer, stupid crap back. If it's not even a number, stupid crap. I removed the div by 0 error checking in the soft division routine, I'm always dividing by 10. I hope that decision enrages at least one person.</p>\n<p>So you know, do this, I don't care:<br>./ImmSuggest burgers<br>57409536 &lt; 57621027 &gt; 57671680<br>mov r0, #57409536<br>add r0, #210944<br>add r0, #544<br>add r0, #3</p>\n",
				
				"date_published": "2019-01-19T08:00:00-04:00",
				"url": "https://xlogicx.micro.blog/2019/01/19/arm-bit-immediates-are-too.html"
			},
			{
				"id": "http://xlogicx.micro.blog/2018/10/05/invoke-is-too-high-level.html",
				"title": "Invoke is Too High Level",
				"content_html": "<p>(or another perspective on the Invoke vs Call argument)</p>\n<p>The video version (for the illiterate) can be found at: [youtu.be/QyjXBv3sq...](https://youtu.be/QyjXBv3sqRY)</p>\n<p>I'm in the process of re-certifying for the GREM certification (GIAC Reverse Engineering Malware). Although I'm pretty good with assembly language in a handful of architectures (Motorolla, x86, propeller, and ARM), my skills are shit with Windows and its APIs. In the context of GREM and static code analysis goes, I still have a ways to go; a 'not seeing the forest for the trees' issue. I will still likely pass the certification like last time, because I understand most of the concepts in their compartmentalized pieces. My problem is some of the big picture stuff (always has been). I joke about everything being too high level, and honestly, most of the time it really is a joke or an extreme over exaggeration. But for me, I sometimes do have a harder time comprehending an abstraction when it abstracts away how things actually work. For most people, it doesn't matter how the technology works, so long as it does. However, as a hacker, I have technology 'trust issues'; things don't always 'just work.'. And the abstraction likely wont give you any hints as to why the thing failed, the answers are revealed at a lower layer.</p>\n<p>Blah blah blah, I digress. I wanted to set out to learn many of these Windows APIs in a bit more detail. Reverse engineering usually teaches how to read the code, but my (and probably your) comprehension magnifies when we actually write code. So in this case, I wanted to set out and write a few very simple assembly programs that put the correct arguments on the stack and call a Windows API, just how I see this happening when debugging some malware, just how it is supposed to work. As a point of reference I am using the FLARE VM setup from FireEye. 
It comes with fasm, so that's the assembler I will use (I don't really have religious preferences with an assembler).</p>\n<p>For APIs, the Windows way is a bit different than the Linux way. For Linux, generally, you put all of your arguments in registers and then do an int 0x80 (interrupt to Linux). In Windows, with 'stdcall' functions, you push all of your arguments to the stack and Call the Windows API function by name (the corresponding addresses of these functions end up getting linked in). I'm not really opposed to this method, it allows for a large number of arguments by default, as it's the stack, not a limited number of registers.</p>\n<p>As I didn't know the fasm ways of assembly, I looked to the Internet for some examples. I wanted to create a simple dialog box. I expected to see a simple assembly program with a .data section with the strings and then the .text (.code) section with some instructions pushing the arguments to the stack and then a call to the API function. For pretty much every google result I got, what I got back was a heavily abstracted version of how this is generally done, and the ironic bonus: NO ASSEMBLY INSTRUCTIONS!</p>\n<p>Before I get to that, I will say that I eventually figured out the way to do this with real assembly language in the source file. And it was as straightforward as I would have expected it to be. For reference, here is a screenshot of the source program:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/e01bc77c62.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/e01bc77c62.jpg\" width=\"436\" height=\"273\"></a></p>\n<p>This is what it looks like in the x64dbg debugger:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/e8e8b48d0d.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/e8e8b48d0d.jpg\" width=\"1046\" height=\"426\"></a></p>\n<p>Note that the assembly looks awfully similar to the source. This is no mistake. This is exactly what I'm going for here. 
Remembering that my goal is to try and understand what is actually going on with these API functions, this is the most comprehensible way to go about this. You'll notice that all the arguments are on the stack and ready to go for when I'm about to call them. And it is extremely clear how they all got onto the stack (the 4 preceding push instructions).</p>\n<p>Okay. Now let's talk about the 'no assembly required' way that is recommended to write this. Because the source code is easier to read. Because it's 'cleaner code.' Because assembly language is so 'hard' to write that you might as well write assembly programs that don't use assembly instructions (then just give up and fucking use python). Anyway, here's a screenshot of the 'clean' way to do this:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/d52171a2d9.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/d52171a2d9.jpg\" width=\"837\" height=\"114\"></a></p>\n<p>It is clearer to read. If there were no comments in my version, then the 'invoke' version would be much more obvious in its intentions. But now, here's a screenshot of how dirty and incomprehensible this is in the debugger:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/359669f301.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/359669f301.jpg\" width=\"1049\" height=\"426\"></a></p>\n<p>Before I start ranting and criticizing, I have to be fair and state that the examples I found on the Internet didn't use a .data section and inlined the strings in the invoke section (cleaner source code). This is the real cause of the mess of the disassembly. Had I used a .data section with this invoke command: 'invoke MessageBox,HWND_DESKTOP,message,title1,MB_OKCANCEL', it wouldn't be so bad. I digress. So note that even though the source code is 'clean,' what's actually being 'assembled' (compiled really) is nothing but. You see as we are about to make the call, all the right arguments are on the stack. 
I see two of the original pushes needed for two of our arguments (push 1 and push 0). We also need two more arguments; we need pointers to our strings for the title of the window and the message in the window. How on earth did these get into the stack, and what the fuck are these confusing instructions doing in our program? Do we really need to do ARPL, INSB, OUTSD, DAA, and IMUL instructions? Well no, that's not what is happening. What we are actually seeing is a disassembled representation of our strings. See our first call to 'syscalls.40201B', it's jumping past our first string. A call normally knows how to return to where we came from by pushing the address of the next instruction to the stack. In this case though, our program doesn't intend to return to this at all, it is using that pushed address as a side effect, as that address really is the first byte of our string, it serves as a pointer to it, and it is now on the stack conveniently as an argument. So that call jumps us to another call that does the same thing; it skips over the next string that follows it, getting a pointer to it on the stack, indirectly. So that second call instruction brings us all the way down to the 'push 0' instruction right before our API call to MessageBoxA. These abused CALL instructions are how we got the string arguments onto the stack.</p>\n<p>The end result is the same. As somebody that has to read or write the assembly source, using invoke is likely a better way to write and collaborate. However, nothing about it is actual assembly language, it abstracts it away. It's not like this behavior is uncommon or indefensible. Compilers do this kind of thing all the time, even when they aren't optimized that much (and when they are optimized, wow). Joking aside, using invoke is probably the way to go if you're writing something more serious, although, why not just use C? 
Writing \"assembly\" in shortcuts and macros with no actual assembly sounds a lot like a higher level language (like C). This is why I always found HLA (High Level Assembly) so objectionable. Though to be clear, I respect the author of HLA and he has done other really amazing work.</p>\n<p>A lot of arguments of which way is better than which (with many things) come down to what you're doing at the moment. In the use case from the paragraph above, invoke away. But to return to my use case, I'm trying to familiarize myself with some simple Windows API calls by playing with different arguments in assembly and calling them, and then watching them perform their actions in a debugger (as not all APIs will do something visual; I might have to see the stack, registers, and memory getting manipulated). Using invoke for this strategy makes this process all the more confusing.</p>\n<p>All this said, you might be able to see why I have a little ways to go when it comes to fully reverse engineering Windows binaries. Not to be confused with targeted reversing. I'm somewhat adequate with looking at particular APIs and pulling out IOCs from the artifacts they leave behind, and all the other 'cheater' dynamic forms of analysis. But if I ever want to see a bigger and fuller picture, I'm going to want to start writing the assembly that I'm reading and put bigger pieces of the puzzle together. At least, that's the plan.</p>\n",
				
				"date_published": "2018-10-05T08:00:00-04:00",
				"url": "https://xlogicx.micro.blog/2018/10/05/invoke-is-too-high-level.html"
			},
			{
				"id": "http://xlogicx.micro.blog/2018/09/25/arm-assembly-is-too-high.html",
				"title": "ARM Assembly is Too High Level - ROR and RRX",
				"content_html": "<p>Note: If you prefer video format to reading stuff, there's a companion video for this: <a href=\"https://youtu.be/ONQLWdd5nuc\">https://youtu.be/ONQLWdd5nuc</a></p>\n<p>Looking at instruction encodings, 'ROR r0, #0' should be the same as 'RRX r0, r0'.</p>\n<p>Let's first take a look at the encoding for the ROR instruction:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/e959ab5b2c.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/e959ab5b2c.jpg\" width=\"898\" height=\"96\"></a></p>\n<p> </p>\n<p>So Rm gets rotated imm5 places and gets stored into Rd</p>\n<p>Now let's look at the encoding for RRX:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/344f632120.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/344f632120.jpg\" width=\"898\" height=\"94\"></a></p>\n<p> </p>\n<p>Note that the encoding is identical to ROR, with the exception that the imm5 field is harcoded to 0.</p>\n<p>So if we were to write 'ROR r0, #0', we should expect this to disassemble as an RRX instruction; as we are merely mimicking these hardcoded zeros by providing '#0' as the imm5 value:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/436a2b06b8.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/436a2b06b8.jpg\" width=\"858\" height=\"447\"></a></p>\n<p> </p>\n<p>Yep! wait wut? Very much no. What is the encoding for this MOV instruction?</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/915045e6b1.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/915045e6b1.jpg\" width=\"897\" height=\"94\"></a></p>\n<p> </p>\n<p>Come on assembler, why don't you know that I wanted to do an RRX when properly writing ROR in such a way that should encode into RRX? 
You have no idea how many times I legitimately need to use ROR to use RRX instead of actually typing RRX in assembly (even if it's zero times).</p>\n<p>As a reminder from the ARM Assembly is Too High Level: Moving by Shifting post, the MOV instruction is just an LSL with its immediate value set to 0's. If we were to take the 5 (of the 8) zeros following Rd and make them something else, the above MOV instruction would become an LSL by that amount. After these 5 bits, the next 2 bits define what kind of shift, rotate (or MOV) we are doing. For example, MOV and LSL are '00', LSR is '01', ASR is '10', and RRX and ROR share '11' (you can see this '11' field occur after the imm5 field in the first screenshot above). I really don't know why the assembler jumped from a ROR instruction to the MOV/LSL type instruction when giving it an imm5 value of zero. Sometimes assembly language is too high level for me to comprehend.</p>\n",
				
				"date_published": "2018-09-25T08:00:00-04:00",
				"url": "https://xlogicx.micro.blog/2018/09/25/arm-assembly-is-too-high.html"
			},
			{
				"id": "http://xlogicx.micro.blog/2018/07/14/arm-assembly-is-too-high.html",
				"title": "ARM Assembly is Too High Level - Moving by Shifting",
				"content_html": "<p>The syntax for the register form of a Logical Shift Left is:<br>\nLSL{S}c Rd, Rm, #imm5</p>\n<p>This will take a value stored in the source register of Rm and shift the bits left by the amount defined in #imm5. The result of this operation is stored in the destination register of Rd. Like many instructions on ARM, you can make it conditional defined by a condition code (c) and define if you want the flags to be set as well with {S}.</p>\n<p>The real question (for freaks like me) is what happens if you use #0 as the shift value, doesn't it just effectively move Rm into Rd? Wouldn't LSL r0, r1, #0 do the same thing as MOV r0, r1?</p>\n<p>To say that the answer is 'yes' should be obvious and also boring. The interesting thing is that ARM decides to make the machine encoding for LSL Rd, Rm, #0 equivalent to MOV Rd, Rm. Look at the encoding graphic near the bottom of the post for further context of this and the following info.</p>\n<p>What about LSR? It effectively does the same thing as our MOV when imm5 is 0, but the machine encoding is off by one bit. Interestingly, LSL can have shifts from 0-31 (well, the manual says 1-31...), but LSR has shifts from 1-32, and there is no fucking way to force it to shift by 0. If directly writing the binary for the imm5 field of this instruction, and setting it to 00000, it actually ends up being LSR Rd, Rm, #32; that's how they fit that 32 in there (as 11111 is still 31 in binary). You can look to the last image in this post for a comparison how the source assembly is encoded.</p>\n<p>So the next question is what happens when we just write LSR Rd, Rm, #0? If what the ARM manual says is true about the 1-32 range, this instruction should be illegal (it kind of is). In the case of the 'as' assembler, it changes this to a MOV Rd, Rm, but with the exact same machine encoding as the LSL. See, the 3 bits between the imm5 and Rm field should be '010' for an LSR, it is '000' for the MOV and LSL. 
So the LSR Rd, Rm, #0 gets encoded as an actual different instruction even at the machine encoding level.</p>\n<p>This is all interesting. It just goes to show that it is regular behavior for an assembler to just take your shitty assembly and do whatever it fucking feels like with it. This is, I should remind everyone, because assembly language is too high level.</p>\n<p>For reference, here is a nice little graphic illustrating the machine encodings of each instruction discussed:<br>\n<a><img src=\"https://xlogicx.micro.blog/uploads/2026/screen-shot-2018-07-13-at-10.57.35-pm.png\" width=\"600\" height=\"192\" alt=\"\"></a></p>\n<p>Also for reference is some source assembly instructions (cat'ed out) followed by a disassembly of the assembled program (using objdump):<br>\n<a><img src=\"https://xlogicx.micro.blog/uploads/2026/screen-shot-2018-07-13-at-11.11.00-pm.png\" width=\"600\" height=\"621\" alt=\"\"></a></p>\n",
				
				"date_published": "2018-07-14T08:00:00-04:00",
				"url": "https://xlogicx.micro.blog/2018/07/14/arm-assembly-is-too-high.html"
			},
			{
				"id": "http://xlogicx.micro.blog/2017/04/13/sedregex-based-brainfuck-compiler.html",
				"title": "sed-regex Based BrainFuck Compiler",
				"content_html": "<p>BrainFuck is an 'esoteric' programming language with only 8 <em>one</em>-character instructions. I've used it here-and-there for well over a decade. I love minimalist languages, so RISCy. A brainfuck environment operates on a large array of data. There's an instruction to move the pointer in this array forwards and backwards and to increment or decrement it's value...that's already half the language. There's also an instruction for input or output of 1 character. Finally, there's an instruction to start and stop a loop. That's the language. There are many interpreters out there, I'm sure there are some compilers too, but I wanted to write a one-liner-esque command with common simple *nix tools. I actually ended up with 4 separate commands, but we will get to all of that.</p>\n<p> </p>\n<h3><strong>Technical notes:</strong></h3>\n<p><strong>sed</strong> is the main tool I used to compile the BrainFuck code into x86. <strong>dd</strong>, <strong>printf</strong>, <strong>tr</strong>, and <strong>cat</strong> are helpers with formatting and staging the resulting 512-byte boot image output.</p>\n<p>This does mean that the output file is limited to 512 bytes (more like 470-ish bytes after some of the overhead).</p>\n<p>Each BrainFuck command takes different amounts of bytes to encode:<br>\n&gt;: 1 byte<br>\n&lt;: 1 byte<br>\n+: 2 bytes<br>\n-: 2 bytes<br>\n,: 12 bytes<br>\n.: 5 bytes<br>\n[: 3 bytes<br>\n]: 10 bytes</p>\n<p>The looping stack (controlled by '[', and ']') has 4k allocated to it. 
It is unlikely you will come to the nested loop limit with this.</p>\n<p>The data stack that you control with &lt;, &gt;, +, and - is about 21.5k (in theory)</p>\n<p> </p>\n<h3><strong>The Compiler command:</strong></h3>\n<p><strong>dd bs=512 count=1 if=/dev/zero of=out.bin; printf '\\x31\\xc0\\x8e\\xd8\\x8e\\xd0\\xbc\\x00\\x9c\\xbb\\x00\\xac\\xb4\\xb8\\x8e\\xc0\\xb0\\x03\\xcd\\x10\\xb4\\x01\\xb5\\x26\\xcd\\x10\\xb9\\xd0\\x07\\xb8\\x20\\x00\\x31\\xff\\xf3\\xab\\x31\\xff' | dd of=out.bin bs=1 count=38 conv=notrunc ;printf '\\x55\\xaa' | dd of=out.bin bs=1 seek=510 count=2 conv=notrunc; cat source.bf | tr '\\n' 'x' | sed -e 's/x//' -e 's/&gt;/\\x43/g' -e 's/&lt;/\\x4b/g' -e 's/\\+/\\xfe\\x07/g' -e 's/\\-/\\xfe\\x0f/g' -e 's/,/\\xb4\\x01\\xcd\\x16\\x74\\xfa\\x30\\xe4\\xcd\\x16\\x88\\x07/g' -e 's/\\./\\x8a\\x07\\xb4\\x0f\\xab/g' -e 's/\\[/\\xe8\\x00\\x00/g' -e 's/\\]/\\x5a\\x52\\x80\\x3f\\x00\\x74\\x02\\xff\\xe2\\x5a/g' | dd of=out.bin bs=1 seek=38 conv=notrunc</strong></p>\n<p>Let's go through these 4 separate commands in detail:</p>\n<p><strong>dd bs=512 count=1 if=/dev/zero of=out.bin</strong><br>\nWe are using an output file of out.bin. We want to start with a blank 512 byte file. So we use dd, set the block size to 512 bytes and use /dev/zero as the source of nulls to pad this file with.</p>\n<p><strong>printf '\\x31\\xc0\\x8e\\xd8\\x8e\\xd0\\xbc\\x00\\x9c\\xbb\\x00\\xac\\xb4\\xb8\\x8e\\xc0\\xb0\\x03\\xcd\\x10\\xb4\\x01\\xb5\\x26\\xcd\\x10\\xb9\\xd0\\x07\\xb8\\x20\\x00\\x31\\xff\\xf3\\xab\\x31\\xff' | dd of=out.bin bs=1 count=38 conv=notrunc</strong><br>\nIn this command we are sending machine code to dd. It's 38 bytes of machine code (count=38). At a high level, this machine code sets up the environment so we have our stacks, video memory (for output), a routine to clear the screen, etc... Below is what a dis<span style=\"color: #0000ff;\">assembly</span> of this would look like (as it shows the <span style=\"color: #ff0000;\">machine code</span> too). 
Even though this is partially an objdump output, I have included <span style=\"color: #339966;\">comments</span> too<br>\n<span style=\"color: #ff0000;\">31 c0</span>        <span style=\"color: #0000ff;\">xor    ax,ax</span>        <span style=\"color: #339966;\">;set ax to 0</span><br>\n<span style=\"color: #ff0000;\">8e d8</span>        <span style=\"color: #0000ff;\">mov    ds,ax</span>        <span style=\"color: #339966;\">;DS=0</span><br>\n<span style=\"color: #ff0000;\">8e d0</span>        <span style=\"color: #0000ff;\">mov    ss,ax</span>        <span style=\"color: #339966;\">;stack starts at 0</span><br>\n<span style=\"color: #ff0000;\">bc 00 9c</span>    <span style=\"color: #0000ff;\">mov    sp,0x9c00</span>    <span style=\"color: #339966;\">;200h past code start, 4k of stack</span><br>\n<span style=\"color: #ff0000;\">bb 00 ac</span>    <span style=\"color: #0000ff;\">mov    bx,0xac00</span>    <span style=\"color: #339966;\">;brainfuck stack location</span><br>\n<span style=\"color: #ff0000;\">b4 b8</span>        <span style=\"color: #0000ff;\">mov    ah,0xb8</span>        <span style=\"color: #339966;\">;text video memory</span><br>\n<span style=\"color: #ff0000;\">8e c0 </span>       <span style=\"color: #0000ff;\">mov    es,ax</span>        <span style=\"color: #339966;\">;ES=0xB800</span><br>\n<span style=\"color: #ff0000;\">b0 03 </span>       <span style=\"color: #0000ff;\">mov    al,0x3</span>        <span style=\"color: #339966;\">;Set Video Mode</span><br>\n<span style=\"color: #ff0000;\">cd 10</span>        <span style=\"color: #0000ff;\">int    0x10</span>            <span style=\"color: #339966;\">;syscall</span><br>\n<span style=\"color: #ff0000;\">b4 01</span>        <span style=\"color: #0000ff;\">mov    ah,0x1</span>       <span style=\"color: #339966;\"> ;Set Cursor Shape</span><br>\n<span style=\"color: #ff0000;\">b5 26</span>        <span style=\"color: #0000ff;\">mov    ch,0x26</span><br>\n<span style=\"color: #ff0000;\">cd 
10</span>        <span style=\"color: #0000ff;\">int    0x10</span>            <span style=\"color: #339966;\">;syscall</span><br>\n<span style=\"color: #ff0000;\">b9 d0 07</span>    <span style=\"color: #0000ff;\">mov    cx,0x7d0</span>       <span style=\"color: #339966;\"> ;whole screens worth</span><br>\n<span style=\"color: #ff0000;\">b8 20 00</span>    <span style=\"color: #0000ff;\">mov    ax,0x20</span>       <span style=\"color: #339966;\"> ;empty black background</span><br>\n<span style=\"color: #ff0000;\">31 ff </span>       <span style=\"color: #0000ff;\">xor    di,di</span>        <span style=\"color: #339966;\">;upper left corner</span><br>\n<span style=\"color: #ff0000;\">f3 ab</span>        <span style=\"color: #0000ff;\">rep stos WORD PTR es:[di],ax</span>    <span style=\"color: #339966;\">;clear screen</span><br>\n<span style=\"color: #ff0000;\">31 ff </span>       <span style=\"color: #0000ff;\">xor    di,di </span>       <span style=\"color: #339966;\">;set cursor back to upper left corner</span></p>\n<p><strong>printf '\\x55\\xaa' | dd of=out.bin bs=1 seek=510 count=2 conv=notrunc</strong><br>\nThe image file needs to end with 0x55aa to be considered a bootable image. 
So we use printf and dd to inject these bytes into the image</p>\n<p><strong>cat source.bf | tr '\\n' 'x' | sed -e 's/x//' -e 's/&gt;/\\x43/g' -e 's/&lt;/\\x4b/g' -e 's/\\+/\\xfe\\x07/g' -e 's/\\-/\\xfe\\x0f/g' -e 's/,/\\xb4\\x01\\xcd\\x16\\x74\\xfa\\x30\\xe4\\xcd\\x16\\x88\\x07/g' -e 's/\\./\\x8a\\x07\\xb4\\x0f\\xab/g' -e 's/\\[/\\xe8\\x00\\x00/g' -e 's/\\]/\\x5a\\x52\\x80\\x3f\\x00\\x74\\x02\\xff\\xe2\\x5a/g' | dd of=out.bin bs=1 seek=38 conv=notrunc</strong><br>\nThis is the compound command that does the real work!</p>\n<p>First we feed the source file to our chain of commands with '<strong>cat</strong>'<br>\nWe then use <strong>tr</strong> (translate) to replace the ending newline with the x character (because we can't just make it blank / remove it with tr)<br>\nThen we have a long line of find/replace commands with <strong>sed</strong>. The first one is simple, we now can just remove that 'x' that we put in there, so there are no longer any newlines.<br>\nThe rest of the find/replace (regex) <strong>sed</strong> arguments are to literally find BrainFuck instructions and replace it with equivalent machine code. Before we explore each instruction, let's make some notes about our environment. I mentioned that we are using a large array in our environment, we will be using the 16 bit 'bx' register to track this; we use it as a pointer that we increment and decrement. 
I considered not even using a real stack for this resulting program, because it is mostly not needed as a concept, but I ended up needing it for the looping instructions, which I will get to</p>\n<h4><strong>Instructions:</strong></h4>\n<p>Instruction: <strong>&gt;</strong> (increment array pointer)<br>\n<span style=\"color: #ff0000;\">43</span>           <span style=\"color: #0000ff;\"> inc    bx</span>   <span style=\"color: #339966;\"> ;bx is our pointer, increment it</span></p>\n<p>Instruction: <strong>&lt;</strong> (decrement array pointer)<br>\n<span style=\"color: #ff0000;\">4b </span>           <span style=\"color: #0000ff;\">dec    bx </span>   <span style=\"color: #339966;\">;decrement pointer</span></p>\n<p>Instruction: <strong>+</strong> (increment value in array)<br>\n<span style=\"color: #ff0000;\">fe 07</span>        <span style=\"color: #0000ff;\">inc    BYTE PTR [bx]</span>    <span style=\"color: #339966;\">;increment the value</span></p>\n<p>Instruction: <strong>-</strong> (decrement value in array)<br>\n<span style=\"color: #ff0000;\">fe 0f </span>       <span style=\"color: #0000ff;\">dec    BYTE PTR [bx]</span>    <span style=\"color: #339966;\">;decrement the value</span></p>\n<p>Instruction: <strong>,</strong> (Take keyboard input)<br>\n<span style=\"color: #ff0000;\">b4 01 </span>       <span style=\"color: #0000ff;\">mov    ah,0x1 </span>              <span style=\"color: #339966;\"> ;check if keyboard input</span><br>\n<span style=\"color: #ff0000;\">cd 16</span>        <span style=\"color: #0000ff;\">int    0x16 </span>                   <span style=\"color: #339966;\">;interupt for it</span><br>\n<span style=\"color: #ff0000;\">74 fa</span>       <span style=\"color: #0000ff;\"> je     0x2c</span>                   <span style=\"color: #339966;\"> ;keep checking if no input</span><br>\n<span style=\"color: #ff0000;\">30 e4</span>        <span style=\"color: #0000ff;\">xor    ah,ah </span>              <span style=\"color: #339966;\"> 
;get keypress</span><br>\n<span style=\"color: #ff0000;\">cd 16</span>        <span style=\"color: #0000ff;\">int    0x16</span>                   <span style=\"color: #339966;\"> ;interupt for it</span><br>\n<span style=\"color: #ff0000;\">88 07</span>       <span style=\"color: #0000ff;\"> mov    BYTE PTR [bx],al  </span>      <span style=\"color: #339966;\">;put keypress in array</span></p>\n<p>Instruction: <strong>.</strong> (Print character)<br>\n<span style=\"color: #ff0000;\">8a 07</span>       <span style=\"color: #0000ff;\"> mov    al,BYTE PTR [bx] </span>       <span style=\"color: #339966;\">;get character from array</span><br>\n<span style=\"color: #ff0000;\">b4 0f</span>        <span style=\"color: #0000ff;\">mov    ah,0xf </span>               <span style=\"color: #339966;\">;make sure video background is still black (possible this isn't needed)</span><br>\n<span style=\"color: #ff0000;\">ab </span>           <span style=\"color: #0000ff;\">stos   WORD PTR es:[di],ax </span>   <span style=\"color: #339966;\">;display the character</span></p>\n<p>Instruction: <strong>[</strong> (Start the loop)<br>\n<span style=\"color: #ff0000;\">e8 00 00</span>    <span style=\"color: #0000ff;\">call   0x40  </span>      <span style=\"color: #339966;\">;pushes address of next instruction onto the stack</span></p>\n<p>Instruction: <strong>]</strong> (End the loop if array at pointer is zero)<br>\n<span style=\"color: #ff0000;\">5a</span>            <span style=\"color: #0000ff;\">pop    dx </span>                 <span style=\"color: #339966;\">  ;get loop start</span><br>\n<span style=\"color: #ff0000;\">52 </span>           <span style=\"color: #0000ff;\">push   dx </span>                  <span style=\"color: #339966;\"> ;keep it on the stack</span><br>\n<span style=\"color: #ff0000;\">80 3f 00</span>    <span style=\"color: #0000ff;\">cmp    BYTE PTR [bx],0x0</span>    <span style=\"color: #339966;\">;is the value zero yet</span><br>\n<span style=\"color: 
#ff0000;\">74 02 </span>      <span style=\"color: #0000ff;\"> je     0x49 </span>                   <span style=\"color: #339966;\">;if so, the loop is over, jump past it</span><br>\n<span style=\"color: #ff0000;\">ff e2 </span>      <span style=\"color: #0000ff;\"> jmp    dx </span>                  <span style=\"color: #339966;\"> ;otherwise, go back to the beginning of the loop</span><br>\n<span style=\"color: #ff0000;\">5a</span>           <span style=\"color: #0000ff;\"> pop    dx</span>                   <span style=\"color: #339966;\"> ;return stack to pristine state</span></p>\n<p> </p>\n<h3><strong>PoC||GTFO:</strong></h3>\n<p>Just to show this works, I used the first two examples from Wikipedia's article on BrainFuck</p>\n<h4>Adding two values:</h4>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/9558070464.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/9558070464.jpg\" width=\"733\" height=\"382\"></a></p>\n<p> </p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/f16b5c7560.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/f16b5c7560.jpg\" width=\"1143\" height=\"394\"></a></p>\n<p> </p>\n<h4>Hello World:</h4>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/a4e042c593.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/a4e042c593.jpg\" width=\"592\" height=\"713\"></a></p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/e8b5035dbb.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/e8b5035dbb.jpg\" width=\"1136\" height=\"417\"></a></p>\n<p> </p>\n<p>...because reasons</p>\n",
				
				"date_published": "2017-04-13T08:00:00-04:00",
				"url": "https://xlogicx.micro.blog/2017/04/13/sedregex-based-brainfuck-compiler.html"
			},
			{
				"id": "http://xlogicx.micro.blog/2017/02/09/assembly-is-too-high-level.html",
				"title": "Assembly is Too High Level - Commutative Property, Sometimes - it may save your byte",
				"content_html": "<p><a href=\"https://xlogicx.net/images/apples.png\"><img src=\"https://xlogicx.micro.blog/uploads/2026/a36e62626b.jpg\" width=\"1024\" height=\"518\"></a></p>\n<p>I remember learning these properties in basic algebra: Associative, Distributive, and Commutative. It's the Commutative property that states that a + b = b + a. The same principle is true with multiplication. In x86 pointer math, of course the results of these operations follow the commutative property; that's just math. However, the machine encoding doesn't consistently take this into account. To be facetious with the blog title image, machine code takes apple color into account most of the time, assembly language just looks at the number of apples.</p>\n<p>To spoil everything from the beginning, 'xor byte [esp + eax], 0' is encoded the same as 'xor byte [eax + esp], 0'. In machine code, when using esp as one of two non-scaled registers as a pointer, the commutative property is acknowledged. However, any other of the 8 general purpose registers are not treated this way! In other words, 'xor byte [ebx + ecx], 0' is not encoded the same as 'xor byte [ecx + ebx], 0'</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/a6805177be.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/a6805177be.jpg\" width=\"339\" height=\"30\"></a></p>\n<p>The claim I just made is slightly unfair though, or unfair-ish. Assemblers (like nasm) allow us to get kind of loose with our assembly. An instruction like 'xor byte [ebx + ecx], 0' isn't really showing the whole story. In assembly, these pointers are made of up 3 parts (all 3 are optional...ish): one base register, one scaled register, and one displacement (8/32 bit offset). Scaled registers can be multiplied by 1, 2, 4, or 8. So more accurately, the above instruction is actually a base register of 'ebx' + a scaled register of 'ecx' with a scale of 1 (multiplied by 1). 
In machine code, a base register and a scaled register are encoded entirely differently.</p>\n<p>With the above knowledge in mind, it's no surprise that 'xor byte [ebx + ecx * 1], 0' is not the same as 'xor byte [ecx + ebx * 1], 0'. Even if the memory location this points to is the same (it is), it is now obvious why these are encoded differently...except for when it's the 'esp' register...</p>\n<p>When writing assembly, 'xor byte [eax + esp * 1], 0' would get encoded the same as 'xor byte [esp + eax * 1], 0'. The actual encoding for both more accurately represents the 2nd form of this instruction: xor byte [esp + eax * 1]. Remember, the esp register can not be scaled (see the 'Why ESP doesn't scale (But EBP can still Base)' post). If I were to write 'xor byte [eax + esp * 2], 0' instead of '* 1', I would get an error from my assembler. With '* 1', however, my assembler (nasm) is clever enough to know that even though my instruction (scaling esp by 1) is not valid, it can replace it with an equivalent instruction (using the commutative property), and all is well. But without knowing machine code, this would all be happening magically behind the scenes to us, because assembly is too high level.</p>\n<p>Before we forget, let's take a look at ebp, because the ebp register also gets encoded differently sometimes in memory pointers. Even though esp can not in any case be encoded as a scaled register, making ebp the base register can be done, but comes with a compromise: the displacement component of the pointer is no longer optional. So if you didn't include a displacement in your assembly, the assembler will add a zeroed out byte as a displacement for you. 
In other words, 'xor byte [ebp + eax], 0' is actually more accurately 'xor byte [ebp + eax + 0x00], 0'.</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/b6a05c9e74.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/b6a05c9e74.jpg\" width=\"355\" height=\"30\"></a></p>\n<p>So given the above information about the encoding of ebp, we are faced with a trade-off. This is one of those times where my assembler takes me literally, in the sense that it obeys making ebp the base (first) register (even though it may add an extra null byte behind our backs to do it). So even though 'xor byte [eax + ebp * 1], 0' logically does the same thing (commutative property), nasm does not choose this form, because it doesn't need to, unlike with the scaled esp register. The interesting thing is that this alternate form of the instruction is a byte shorter (because it doesn't need that displacement byte). The takeaway is that if you are using two unscaled registers in your pointer, and ebp is one of them, and you didn't already have a displacement: make ebp the last one (all to save one byte).</p>\n",
				
				"date_published": "2017-02-09T09:00:00-04:00",
				"url": "https://xlogicx.micro.blog/2017/02/09/assembly-is-too-high-level.html"
			},
			{
				"id": "http://xlogicx.micro.blog/2017/02/01/assembly-is-too-highlevel-signed.html",
				"title": "Assembly is Too High-Level - Signed Displacements",
				"content_html": "<p>For those that don't know about unsigned and signed data types, it's not all that complicated. One byte can hold a total of 256 possible values. If these values were only positive numbers and included zero, we would have a number range of 0-255. But what if we wanted negative numbers? The byte is divided; we now have a range of -128 through 127. When including zero, this is all 256 possible values. The data is formatted as [en.wikipedia.org/wiki/Two'...](https://en.wikipedia.org/wiki/Two's_complement)</p>\n<p>With a 32 bit register, we have four bytes, giving us 0-4,294,967,295 for unsigned values and -2,147,483,647 through 2,147,483,647 for a signed value.</p>\n<p>When using instructions that use memory locations for an operand, we are allowed to use literal displacements. For example, we can use an address/pointer stored in EAX and add 128 to the address (not value). This 128 is called a displacement, or offset. This displacement can be up to 32 bits, but if the displacement is small enough, it can be encoded as 8 bits.</p>\n<p>One thing to note however, is that this displacement is a signed value. At least for me, the documentation of this fact was subtle. And in practice, if you operate at the level of abstraction that assembly language provides and don't know this subtlety, it is possible to run into some unfortunate bugs.</p>\n<p>A good example is the two following lines of assembly language:<br>\nlea ebx, [eax - 1337]<br>\nlea ebx, [eax + 4294965959]</p>\n<p>As it turns out, when it comes to machine code, both of these instructions are completely identical. Remember to look up to the range of a 32 bit signed value, the range stops after around 2 billion; it doesn't go all the way up to 4 billion as in the above lea instruction example. This 4 billion value would be valid as an unsigned 32 bit number, however. Nasm doesn't appear to care about warning us; it just encodes this positive 4 billion number as if it were unsigned. 
But when you actually execute this instruction, it is surely treated as a signed number (which turns out to be -1337).</p>\n<p>Exhibit A:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/ce5c1dbd82.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/ce5c1dbd82.jpg\" width=\"537\" height=\"74\"></a></p>\n<p>We see that the 2nd line of assembly has the eax-1337, even though the machine code has a value of 0xfffffac7 (2's complement, little endian). My assembly source actually used lea ebx, [eax + 4294965959] to generate that line. Even though my source added a large positive number, and the machine code appears to be a large number (if it weren't 2's complement), the disassembled version in edb (not how my original source was written) is what is actually executed. We know this because 0x80000000 - 1337 is 0x7ffffac7 (the result that gets stored into ebx shown in the screenshot).</p>\n<p>Here's another thing to think about that can get glossed over when not paying attention to the details. In 32 bit, negative 1337 is represented as 0xfffffac7 in 2's complement. If this value were to be interpreted as unsigned, its value would be 4,294,965,959. This is where I got this value to use in my examples above.</p>\n<p>Does this same logic work in 8 bit? Say we use negative 100. This is 0x9c in 2's complement, and as an unsigned value 0x9c is 156. So to review, these two instructions are the same:<br>\nlea ebx, [eax - 1337]<br>\nlea ebx, [eax + 4294965959]</p>\n<p>Does that mean that these two are too?<br>\nlea ebx, [eax - 100]<br>\nlea ebx, [eax + 156]</p>\n<p>The answer is no. 
Note the below screenshot showing the difference between adding 127 and 156:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/63d1d7935d.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/63d1d7935d.jpg\" width=\"317\" height=\"30\"></a></p>\n<p>As a signed number, 156 would be too big to fit into 8 bits, so nasm graduates this to 32 bits instead, and it actually gets interpreted as positive, whereas the 4 billion positive number is unable to graduate to anything higher than 32 bits, as there is no machine encoding for this (even in a 64 bit architecture; this encoding is not part of the ModR/M encoding).</p>\n<p>What about this instruction?:<br>\nlea ebx, [eax + 4294967292]<br>\nBelow is the same instruction represented in hex instead of decimal:<br>\nlea ebx, [eax + 0xfffffffc]</p>\n<p>The resulting machine code is 8d58fc. Where'd all the f's go? They are leading f's, just as there's such a thing as leading zeros. As in 00003 is the same thing as 3. In two's complement, f's in a negative number are leading f's. This number represents -4, which is small enough to fit into 1 byte; all the leading ff bytes can be dropped.</p>\n<p>And on the other hand, the below assembly instruction:</p>\n<p>lea ebx, [eax + 0xfc]</p>\n<p>Has the machine code of 8d98fc000000; a bunch of leading zeros (they look like trailing zeros due to little endian) get added to the machine code. Nasm must pick between the ambiguities of 0xfc. Does the author mean '+ 252' or do they mean this to be a signed value of '-4'? Nasm chooses to interpret 0xfc as positive, but if it were to put this data into only one signed byte, it would actually represent negative 4, so it has to be placed into 4 bytes.</p>\n",
				
				"date_published": "2017-02-01T08:00:00-04:00",
				"url": "https://xlogicx.micro.blog/2017/02/01/assembly-is-too-highlevel-signed.html"
			},
			{
				"id": "http://xlogicx.micro.blog/2016/08/29/boot-sector-graphical-programming-tutorial.html",
				"title": "Boot Sector Graphical Programming - Tutorial",
				"content_html": "<p>This tutorial is aimed at those that have some assembly experience, but very minimal 16-bit BIOS programming experience, in other words; a short list of some of my friends that I want to coerce into doing some BIOS programming.</p>\n<h3><strong>Assembling:</strong></h3>\n<p><strong>Qemu</strong><br>\nAssemble source: nasm yourboot.asm -f bin -o yourboot.bin<br>\nRun with qemu: qemu tronsolitare.bin<br>\nRun with qemu (alternate): qemu-system-i386 -hda yourboot.bin</p>\n<p><strong>VirtualBox</strong><br>\nCreate floppy image: Use this padding in 2nd to last line of code: 1440 * 1024) - ($ - $$) db 0 (instead of times 510-($-$$) db 0)<br>\nRun floppy image in VirtualBox: Create a low spec VM and set it to boot to yourboot.bin as the floppy image. Either rename image file to tronsolitare.img or use: nasm yourboot.asm -f bin -o yourboot.img</p>\n<p><strong>Disassembly</strong></p>\n<p>Disassemble tronsolitare.bin with objdump: objdump -D -b binary -mi386 -Maddr16,data16 tronsolitare.bin</p>\n<p> </p>\n<h3>BIOS Programming Environment:</h3>\n<p>These programs are small, as in only 512 bytes of code. Fortunately, you don't have to do absolutely everything, there are some extremely useful BIOS routines that you can call to do some heavy lifting. A good guide/lookup of these routines can be found at [wiki.osdev.org/BIOS.](http://wiki.osdev.org/BIOS.)</p>\n<p>In the below examples, an 80x25 (80 columns and 25 rows of characters) display is assumed. This isn't the only mode, it's just a mode I feel comfortable with. It's actually not too small. The main reason to not go too much bigger is the memory challenges that you will already have will get even more noticeable.</p>\n<p>Each 'character' is actually 2 bytes (16-bits) of information. The first 4 bits is the background color, the next 4 bits is the text color, and the last 8 bits is the actual character. The 8-bit character is not ASCII however (it is similar), it is code-437. 
<a href=\"https://en.wikipedia.org/wiki/Code_page_437\">en.wikipedia.org/wiki/Code_page_437</a> has more information on this.</p>\n<p>You will have a register (di) that points to this video memory. It starts at the upper left of the screen. As you increment di, it stays on the same row and moves to the right. After it reaches the end of the row, it moves back to the left on the next row.</p>\n<p> </p>\n<h3><strong>Code Basics:</strong></h3>\n<p>This section will have snippets of code to get you started</p>\n<p> </p>\n<h4><strong>BIOS signature and padding</strong></h4>\n<p>This is really the only required part that you need in your bootable image. This should be your last 2 lines of code.</p>\n<blockquote>\n<h5>times 510-($-$$) db 0<br>\ndw 0xAA55</h5>\n</blockquote>\n<p>The last line of code is a 2-byte signature that must be at the end of your image file. The line of code above it makes sure that no matter how much code you write, your image file will be exactly 512 bytes (after your code, the rest of the file will be filled with nulls, and then end with the signature).</p>\n<h4><strong>Nasm ORG directive</strong></h4>\n<p>This should be the first line of your assembly file when using the nasm assembler</p>\n<blockquote>\n<h5>   [ORG 0x7c00]</h5>\n</blockquote>\n<p> </p>\n<h4><strong>Basic Video (and stack) Setup</strong></h4>\n<blockquote>\n<h5>xor ax, ax ;make it zero<br>\nmov ds, ax ;DS=0</h5>\n<h5>mov ss, ax ;stack starts at 0. Add these lines only if you plan to use stack (Call/Ret, Push/Pop, etc...)<br>\nmov sp, 0x9c00 ;2000h past code start</h5>\n<h5>mov ah, 0xb8 ;text video memory<br>\nmov es, ax ;ES=0xB800</h5>\n<h5>mov al, 0x03<br>\nint 0x10</h5>\n<h5>mov ah, 1<br>\nmov ch, 0x26<br>\nint 0x10</h5>\n</blockquote>\n<p>This code does all of the BIOS video overhead. 
It initializes the data segment (to zero), allocates an area of memory for the stack and puts the pointer at 0x9c00 (assuming you'll be using the stack, and you probably should), points es at video memory (segment 0xb800), sets the video mode to 80x25 (80 columns by 25 rows), and hides the cursor. For more video BIOS info, check out <a href=\"https://en.wikipedia.org/wiki/INT_10H\">en.wikipedia.org/wiki/INT_10H</a>.</p>\n<p> </p>\n<h4><strong>Some notable Registers and Instructions for Video</strong></h4>\n<p>di - pointer to video memory location<br>\nax - pixel data bfcc (4 bit background color, 4 bit foreground color, 8-bit character (code page 437))<br>\ncx - counter used in combination with rep stosw<br>\nstosw - stores ax at es:[di] (video memory) and advances di by 2; with a rep prefix it repeats cx times, decrementing cx each time<br>\nscasw - 1 byte way to advance di by 2, one character cell (even though it's supposed to scan a string...)<br>\ncbw - 1 byte trick to zero out ah (zero mask) and leave al intact so long as al is less than 0x80</p>\n<p> </p>\n<h4><strong>A Simple Time Delay Loop</strong></h4>\n<blockquote>\n<h5>mov bx, [0x046C] ;Get timer state<br>\nadd bx, 2 ;2 ticks (can be more)<br>\ndelay:<br>\ncmp [0x046C], bx<br>\njb delay</h5>\n</blockquote>\n<h4><strong>Example of keyboard input for arrow keys</strong></h4>\n<p>This may look a little complicated, but it's not so bad. It's also the most reliable way that I've found to take keyboard input without lag, with continuous polling, and with memory of the last key pressed. 
For more keyboard information, check out <a href=\"https://en.wikipedia.org/wiki/INT_16H\">en.wikipedia.org/wiki/INT_16H</a></p>\n<blockquote>\n<h5>LEFT EQU 75<br>\nRIGHT EQU 77<br>\nUP EQU 72<br>\nDOWN EQU 80</h5>\n<h5>;............<br>\n;............</h5>\n<h5>;Get keyboard state<br>\nmov ah, 1<br>\nint 0x16<br>\npop ax<br>\njz persisted ;if no keypress, jump to persisting move state</h5>\n<h5>;Read the key (this clears it from the keyboard buffer)<br>\nxor ah, ah<br>\nint 0x16</h5>\n<h5>;Otherwise, move in direction last chosen<br>\npersisted:<br>\npush ax<br>\n;Check for directional pushes and take action<br>\ncmp ah, LEFT<br>\nje left<br>\ncmp ah, RIGHT<br>\nje right<br>\ncmp ah, UP<br>\nje up<br>\ncmp ah, DOWN<br>\njne mainloop</h5>\n<h5>down:<br>\n;do down stuff</h5>\n<h5>left:<br>\n;do left stuff<br>\nup:<br>\n;do up stuff<br>\nright:<br>\n;do right stuff</h5>\n<h5>;do default stuff<br>\nret</h5>\n</blockquote>\n<h4><strong>Random<br>\n</strong></h4>\n<p>This instruction actually reads the time-stamp counter and puts the returned value into edx:eax, so ax ends up holding the low 16 bits of the counter. Because of this, low bits are more random than high bits.</p>\n<blockquote>\n<h5>   rdtsc</h5>\n</blockquote>\n<h3><strong>Basic Hello World:</strong></h3>\n<blockquote>\n<h5>;Hello World<br>\n[ORG 0x7c00]</h5>\n<h5>;Init the environment<br>\nxor ax, ax                       ;make it zero<br>\nmov ds, ax                     ;DS=0</h5>\n<h5>mov ss, ax                     ;stack starts at 0<br>\nmov sp, 0x9c00            ;2000h past code start</h5>\n<h5>mov ah, 0xb8                ;text video memory<br>\nmov es, ax                    ;ES=0xB800</h5>\n<h5>mov al, 0x03<br>\nint 0x10</h5>\n<h5>mov ah, 1<br>\nmov ch, 0x26<br>\nint 0x10</h5>\n<h5>;Fill in all blue<br>\nmov cx, 0x07d0               ;whole screen's worth<br>\nmov ax, 0x1f20                ;empty blue background<br>\nxor di, di<br>\nrep stosw                        ;push it to video memory</h5>\n<h5>;Print Hello World<br>\nmov di, 0x07c4                     ;coord to start 'Hello World' 
message<br>\nhelloloop: mov al, [hello]    ;get the current message character<br>\nmov ah, 0x0f                         ;white text on black background<br>\nstosw                                     ;commit char to video memory<br>\ninc byte [helloloop + 0x01]  ;next character (self modifying code, 'helloloop: mov al, [hello]' is '[hello + 1]' next, and then '[hello + 2]', etc...)<br>\ncmp di, 0x07e4                     ;is it the last character?<br>\njne helloloop</h5>\n<h5>;Infinite Loop To end the program<br>\nendloop:<br>\njmp endloop</h5>\n<h5>;Hello World text<br>\nhello:<br>\ndb 0x02, 0x20, 0x48, 0x65, 0x6c, 0x6c, 0x6f, 0x20, 0x57, 0x6f, 0x72, 0x6c, 0x64, 0x21, 0x20, 0x02</h5>\n<h5>;BIOS sig and padding<br>\ntimes 510-($-$$) db 0<br>\ndw 0xAA55</h5>\n</blockquote>\n<h3>Colors:</h3>\n<p>I made a boot sector program (https://github.com/XlogicX/colors) to display all of the foreground/background colors. These are the hex codes that would be in the upper half of the ax register (ah) right before a stosw. 
As an example, knowing that 0x58 is the character for 'X', 0xE458 in ax right before a stosw would produce a red 'X' on a yellow background.</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/e4fdc0d693.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/e4fdc0d693.jpg\" width=\"722\" height=\"427\"></a></p>\n<p> </p>\n<h3><strong>Showcase:</strong></h3>\n<p>Some examples of projects out there that are boot sectors that don't boot to an OS (that I know of):</p>\n<p>Tetranglix (Tetris Clone with scoring)<br>\n<a href=\"https://github.com/Shikhin/tetranglix\">github.com/Shikhin/tetranglix</a><br>\n<a href=\"https://github.com/XlogicX/tetranglix\">github.com/XlogicX/tetranglix</a> (nanochess added color)</p>\n<p>TetrOS (Very old color Tetris Clone, but no score)<br>\n<a href=\"https://github.com/daniel-e/tetros\">github.com/daniel-e/tetros</a></p>\n<p>TronSolitare (Snake with Tron tailing)</p>\n<p><a href=\"https://github.com/XlogicX/tronsolitare\">github.com/XlogicX/tronsolitare</a></p>\n<p>Battle Snakes (Tron like clone)</p>\n<p><a href=\"https://github.com/darkvoxels/battlesnakes\">github.com/darkvoxels/battlesnakes</a></p>\n<p>BootSlide (15 number slider puzzle)</p>\n<p><a href=\"https://github.com/XlogicX/BootSlide\">github.com/XlogicX/BootSlide</a></p>\n<p>Boot Me, Crack Me (Boot Sector CrackMe Game)</p>\n<p><a href=\"http://xlogicx.net/?page_id=618\">xlogicx.net</a></p>\n<p>Validation (Social Commentary Slot Machine Game, private repo, coming soon)</p>\n<p><a href=\"https://github.com/XlogicX/Validation\">github.com/XlogicX/Validation</a></p>\n<p>Boot-Man (Pac-Man Game)</p>\n<p><a href=\"https://github.com/guyhill/Boot-Man\">github.com/guyhill/Boot-Man</a></p>\n<p>Pillman (Another Pac-Man Game)</p>\n<p><a href=\"https://github.com/nanochess/Pillman\">github.com/nanochess/Pillman</a></p>\n<p>Dasher512 (Puzzle Game)</p>\n<p><a href=\"https://github.com/QiZD90/dasher512\">github.com/QiZD90/dasher512</a></p>\n<p>Snake</p>\n<p><a href=\"https://github.com/JulianSlzr/project512\">github.com/JulianSlzr/project512</a></p>\n<p>Invaders</p>\n<p><a href=\"https://github.com/nanochess/Invaders\">github.com/nanochess/Invaders</a></p>\n<p>Fbird (Flappy Bird 
Clone)</p>\n<p><a href=\"https://github.com/nanochess/fbird\">github.com/nanochess/fbird</a></p>\n<p>Petty Bird (Another Flappy Bird Clone)</p>\n<p><a href=\"https://github.com/XanClic/512-petty-bird\">github.com/XanClic/512-petty-bird</a></p>\n<p>GlitchHeArt (Visual)</p>\n<p><a href=\"https://github.com/kurtunderscore/HelloPixelArt\">github.com/kurtunderscore/HelloPixelArt</a></p>\n<p>Goatse (Image)</p>\n<p><a href=\"https://github.com/jbremer/goatse.mbr\">github.com/jbremer/goatse.mbr</a></p>\n<p>Boot2Sol (Game)</p>\n<p><a href=\"https://github.com/masneyb/boot2sol\">github.com/masneyb/boot2sol</a></p>\n<p>Nyanboot (Animated Image)</p>\n<p><a href=\"https://github.com/XanClic/nyanboot\">github.com/XanClic/nyanboot</a></p>\n<p>Phosphene (Hi-Def Animated Fractal)</p>\n<p><a href=\"https://github.com/kmcallister/phosphene\">github.com/kmcallister/phosphene</a></p>\n<p>512B-bootloader-effect (Animated Graphic)</p>\n<p><a href=\"https://github.com/pjanczyk/512B-bootloader-effect\">github.com/pjanczyk/512B-bootloader-effect</a></p>\n",
				
				"date_published": "2016-08-29T08:00:00-04:00",
				"url": "https://xlogicx.micro.blog/2016/08/29/boot-sector-graphical-programming-tutorial.html"
			},
			{
				"id": "http://xlogicx.micro.blog/2016/05/11/cactuscon-slides-machining-a-love.html",
				"title": "CactusCon Slides - Machining - A Love Story",
				"content_html": "<p>Here is the full ~6Mb image that I used as my slide deck within MS Paint in Windows 3.1 for my CactusCon 2016 presentation: Machining, A Love Story. Below the large image are all the images again, slide-by-slide, with brief notes; so there can be some context. All non-screenshot art done by KRT c0c4!N (my lovely girlfriend), it should be noted that I limited her to 16 colors with a specific pallet.</p>\n<p><a><img src=\"https://xlogicx.micro.blog/uploads/2026/selection-019.png\" width=\"418\" height=\"858\" alt=\"\"></a></p>\n<p>The intro slide:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/37754bec72.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/37754bec72.jpg\" width=\"684\" height=\"491\"></a></p>\n<p> </p>\n<p>A slide showing the 2016 CactusCon art in less than 16 colors:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/ad386f0322.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/ad386f0322.jpg\" width=\"684\" height=\"491\"></a></p>\n<p> </p>\n<p>The 'ToC' slide summing up what's to come:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/9644864f2e.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/9644864f2e.jpg\" width=\"684\" height=\"491\"></a></p>\n<p> </p>\n<p>As a teenager, I got my first family computer, a 486DX. Playing games on a SNES or SEGA I felt like I was playing games from the 'gods.' But playing games on a computer made me realize I was using the same platform that could allow me to be one of the gods: I wanted to learn to program.</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/a954283307.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/a954283307.jpg\" width=\"684\" height=\"491\"></a></p>\n<p> </p>\n<p>I tried QBasic, which was fun at first. But I didn't want an interpreted language. 
I wanted to write software where I could just run the executable standalone; where the program was the machine code meant for the processor, not an interpreter.</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/bf9876f172.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/bf9876f172.jpg\" width=\"684\" height=\"491\"></a></p>\n<p> </p>\n<p>I wanted to see what a 'real' program looked like. So I dropped a program into Notepad and inspected it. Even though I knew this code was not printable, I still had a feeling that if I could understand these characters, and had the right editor, I would have all I needed to write software (this assumption turns out to be correct, it's just too bad I didn't find the answer until way later in life)</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/ab2b4c494e.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/ab2b4c494e.jpg\" width=\"684\" height=\"491\"></a></p>\n<p> </p>\n<p>I found the nerdiest friend I knew in school and asked him:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/8e66f054cc.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/8e66f054cc.jpg\" width=\"684\" height=\"491\"></a></p>\n<p> </p>\n<p>He responds:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/939f14c865.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/939f14c865.jpg\" width=\"684\" height=\"491\"></a></p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/a121c59cb5.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/a121c59cb5.jpg\" width=\"684\" height=\"491\"></a></p>\n<p> </p>\n<p>So I ask:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/f2a1d3f206.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/f2a1d3f206.jpg\" width=\"684\" height=\"491\"></a></p>\n<p> </p>\n<p>He responds:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/39137af76c.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/39137af76c.jpg\" width=\"684\" height=\"491\"></a></p>\n<p> 
</p>\n<p>Remembering my exploration of a program inspected in notepad:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/9fb2eb6ae2.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/9fb2eb6ae2.jpg\" width=\"684\" height=\"491\"></a></p>\n<p> </p>\n<p>He still persists, only knowing what he has heard, with no appreciation that there are lower levels handed to us by the false gods of abstraction:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/73cbc77063.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/73cbc77063.jpg\" width=\"684\" height=\"491\"></a></p>\n<p> </p>\n<p>I remember this conversation for eternity. It is the moment I start to hate abstractions, to fundamentally know that if something can't be done at the layer of abstraction we are dealing with, one must only go a level deeper and repeat if needed (even though lower levels of abstraction are more difficult to deal with, they always come with more control and power):</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/e35583a887.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/e35583a887.jpg\" width=\"684\" height=\"491\"></a></p>\n<p> </p>\n<p>This starts my journey to learning programming, assembly language, and machine code. I remember programming in BASIC for this TI-82. But then I learned you can program in assembly for it (Z80 chip). My first program cleared the screen (as intended). My second experiment cleared the memory (it was meant to be 'Hello World'). I gave up on this for a little while.</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/f5c4df50a4.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/f5c4df50a4.jpg\" width=\"684\" height=\"491\"></a></p>\n<p> </p>\n<p>Then I formally learned assembly (and even machine code) for the Motorola 68HC11 embedded system. For class, we didn't get a text-book. Instead we had a lab manual and the Motorola reference manual for this chip. 
The reference manual had every instruction and even the corresponding machine code for each instruction. After doing all of the labs, my personal project was to try to write some code that would replicate itself into memory right after itself. This required an appreciation for machine code.</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/e1f889a2e2.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/e1f889a2e2.jpg\" width=\"684\" height=\"491\"></a></p>\n<p> </p>\n<p>The next architecture I learned assembly for was the Parallax Propeller chip. I wrote a 4-channel wave table based audio driver in assembly. I put the chip in my 4-string bass with a NES controller as input. It was not until later that I experimented with Propeller machine code, only to find out that this architecture is the closest to 1-to-1 between assembly and machine code that I had ever seen. More on this project in the Bass + Computer post.</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/ed6c0d4e5d.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/ed6c0d4e5d.jpg\" width=\"684\" height=\"491\"></a></p>\n<p> </p>\n<p>I finally learn x86 assembly. I learn it from some SANS GREM (GIAC Reverse Engineering Malware) training that a previous employer sent me to. It was actually a fantastic intro to x86 assembly. It also offered/explained a tool that can be used to convert 'shellcode' into a real executable program under Windows. 
I liked this, but really wanted one for GNU/Linux instead (one did not exist at the time).</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/e8156f7929.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/e8156f7929.jpg\" width=\"684\" height=\"491\"></a></p>\n<p> </p>\n<p>I then read more than 10 books on assembly and all 3 volumes (3,500 pages) of the Intel Manual.</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/2215256248.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/2215256248.jpg\" width=\"684\" height=\"491\"></a></p>\n<p> </p>\n<p>I learned that assembly is too high level. I won't go into too much detail on the next 3 slides, as the deeper explanation of these topics is contained within this same blog (in other posts), and is enough for a dedicated talk...</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/5e827a6837.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/5e827a6837.jpg\" width=\"684\" height=\"491\"></a></p>\n<p> </p>\n<p>My rant on responses I see on Stack Overflow (not about the platform itself). Remember, I wanted something like shellcode2exe.py, but for GNU/Linux ELF. To see if there was anything like this, I started with a search, and found that someone had asked this question:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/024ffe9b00.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/024ffe9b00.jpg\" width=\"684\" height=\"491\"></a></p>\n<p> </p>\n<p>This was the first moronic (and highest upvoted) answer. 
It is assembly (not machine code, like the question asked for):</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/19e7f67353.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/19e7f67353.jpg\" width=\"684\" height=\"491\"></a></p>\n<p> </p>\n<p>In the comment of that first answer was this (correct):</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/f56024b932.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/f56024b932.jpg\" width=\"684\" height=\"491\"></a></p>\n<p> </p>\n<p>This is probably the best answer, as it fully satisfies the question of having no headers (PE nor ELF). But there was no proof of concept :(</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/7e3e404373.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/7e3e404373.jpg\" width=\"684\" height=\"491\"></a></p>\n<p> </p>\n<p>Right above (with the moar shit), someone gives an 'example'. It is moar shit because the example is just more assembly (not machine code):</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/3f1f727499.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/3f1f727499.jpg\" width=\"684\" height=\"491\"></a></p>\n<p> </p>\n<p>Then there are these unhelpful tidbits. Being that ELF is not machine code and a.out is not an appropriate alternative to ELF in the context of wanting to do pure machine code.</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/eed0fe2970.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/eed0fe2970.jpg\" width=\"684\" height=\"491\"></a></p>\n<p> </p>\n<p>Finally, I create something for my own needs; writing pure machine code / 'shellcode' and being able to run it, albeit in ELF format. It takes a machine code (ascii hex) source file, and makes an ELF executable of it. 
I respond with my tool and a proof of concept:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/bf66d46c15.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/bf66d46c15.jpg\" width=\"684\" height=\"491\"></a></p>\n<p> </p>\n<p>This is the closest thing to a helpful answer. Not only are we back to the DOS .COM format file (pure machine code, no headers), but there is a proof of concept: the fully functional and executable EICAR antivirus test file. But it wouldn't be Stack Overflow if the most helpful answer weren't the most downvoted one, with the most ignorant responses. 'compiler' says that it doesn't look like machine code (it is). 'petersaints' states that this isn't machine code, and that it's just the EICAR test string (it is machine code, it also tests AV). For an in-depth debug of EICAR, see <a href=\"http://thestarman.pcministry.com/asm/eicar/eicarcom.html\">thestarman.pcministry.com/asm/eicar/eicarcom.html</a> (it's elite). Also, my friend did a write-up on the same topic: <a href=\"http://www.biebermalware.info/2016/05/playing-with-eicar-my-nerdiest-post-ever/\">www.biebermalware.info/2016/05/playing-with-eicar-my-nerdiest-post-ever/</a></p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/8ede6d2cd6.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/8ede6d2cd6.jpg\" width=\"684\" height=\"491\"></a></p>\n<p> </p>\n<p>So now starts the section where I give various ways to write raw machine code and execute it, starting with the Windows platform and shellcode2exe.py. The screenshot itself shows how it is run:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/c889913905.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/c889913905.jpg\" width=\"684\" height=\"491\"></a></p>\n<p> </p>\n<p>This ImmunityDBG screenshot is the output of the above shellcode2exe.py command. Note that I used assembly and machine code from the examples above about assembly being too high-level; hence a few ??? 
disassemblies.</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/5f8c0db883.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/5f8c0db883.jpg\" width=\"684\" height=\"491\"></a></p>\n<p> </p>\n<p>Below is a source file for my m2elf script. I have another blog entry that goes into more depth on these tools: the How to Machine post.</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/f4bf84ee9d.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/f4bf84ee9d.jpg\" width=\"684\" height=\"491\"></a></p>\n<p> </p>\n<p>A screenshot showing the script being run and the result being executed:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/0f27852c9c.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/0f27852c9c.jpg\" width=\"684\" height=\"491\"></a></p>\n<p> </p>\n<p>NASM's data directives (db and friends) are another way of inserting literal bytes into what is otherwise an assembly source file. The advantage of this is that it allows for 64-bit code (my m2elf script only supports 32-bit). The thing to be aware of is byte order (x86 is little-endian), as multi-byte values can end up reversed if you're not paying attention.</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/1e52da5f37.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/1e52da5f37.jpg\" width=\"684\" height=\"491\"></a></p>\n<p> </p>\n<p>Another method is to write boot sector code. The slide below outlines the features of coding this way. 
I wrote a PoC that I call TronSolitare: <a href=\"https://github.com/XlogicX/tronsolitare\">github.com/XlogicX/tronsolitare</a></p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/2fcd9924fb.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/2fcd9924fb.jpg\" width=\"684\" height=\"491\"></a></p>\n<p> </p>\n<p>And to return to a way to write raw machine code without headers, here is a method I could have used as a teenager, if only I had the right knowledge:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/09a8ea0d25.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/09a8ea0d25.jpg\" width=\"684\" height=\"491\"></a></p>\n<p> </p>\n<p>Cheatsheet on how to use DEBUG to write machine code:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/f02927b45a.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/f02927b45a.jpg\" width=\"684\" height=\"491\"></a></p>\n<p> </p>\n<p>Commented machine code for Hello World (as used in live demo):</p>\n<blockquote><p>48656c6c6f20576f726c642124 #Hello World!<br>\nba 0001 #mov 0100 to dx<br>\nb4 09   #mov 9 to ah<br>\ncd 21    #API Call<br>\nb4 4c   #mov 4c into ah (exit)<br>\ncd 21   #API Call</p></blockquote>\n<p>I also wrote a program to interpret commented machine code (like above) and output a .COM file. I demo'd this as well during the talk. Even though the listing below is assembly, I took the machine code from the assembled output and wrote this entire program using debug (in machine code), because I'm a purist...</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/58d06dba2d.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/58d06dba2d.jpg\" width=\"418\" height=\"858\"></a></p>\n",
				
				"date_published": "2016-05-11T08:00:00-04:00",
				"url": "https://xlogicx.micro.blog/2016/05/11/cactuscon-slides-machining-a-love.html"
			},
			{
				"id": "http://xlogicx.micro.blog/2016/02/13/assembly-is-too-high-level.html",
				"title": "Assembly is Too High Level - Repetition of REP Instructions That Don't Repeat Anything",
				"content_html": "<p>The REP (Repeat String Operation) is a pretty cool prefix; It modifies a single string instruction to repeat until the ECX register reaches zero. As this only applies to one instruction (as apposed to a block of code), ECX needs a way to decrement, REP automatically decrements ECX by 1 each execute of the string operation instruction. So the idea is to set ECX to the amount of times you want the string operation to execute and the run the string operation with the REP prefix. The instructions that REP is supposed to be appropriate for are: INS, MOVS, OUTS, LODS, STOS, CMPS, and SCAS.</p>\n<p>Now that I described the coloring book, lets color outside the lines. What happens when we try to REP prefix an instruction not in this list. For example:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/6de7c3c6b9.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/6de7c3c6b9.jpg\" width=\"579\" height=\"146\"></a></p>\n<p>And this is in a debugger after executing the last INC.</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/de34820760.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/de34820760.jpg\" width=\"562\" height=\"87\"></a></p>\n<p>You'll notice that ECX didn't decrement at all, and EAX is only up to '2'; as there were only 2 INC instructions. In other words, the REP prefix was completely ignored. I like this, because gratuitous prefixes can be abused...</p>\n<p>Consistent Instruction Sizes:</p>\n<p>Revisiting the concept discussed in the Consistent Instruction Sizes blog post, this can be done with an ironic repetition of REP instructions that don't repeat anything (best phrase ever):</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/efa67119ec.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/efa67119ec.jpg\" width=\"556\" height=\"324\"></a></p>\n<p>Just like the previous blog post PoC, this one launches a /bin/sh shell as well. 
The thing I love so much more about this one is that the 0xF3 prefix (REP) doesn't really change the original meaning of the code (unless it is one of the very few string instructions). In comparison, the 0x66 and 0x67 override prefixes will change the register sizes all over the place and have to be treated carefully. And with the REP prefix, even if it was a string instruction, all you have to do is make sure to set ECX to 1 before the instruction and it will work like a non-REP as well. But when you clear ECX on the way there, don't just do a normal 'XOR ECX, ECX' with machine code 0x31c9 (which on its own leaves ECX at 0, so an INC ECX still has to follow); you should do the full REP version of 0xf3f3f3f3f3f3f3f3f3f3f3f3f331c9.</p>\n<p>Repeating a NOP to \"pause\":</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/18a12f5f48.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/18a12f5f48.jpg\" width=\"163\" height=\"109\"></a></p>\n<p>The above source file produces the below results in most debuggers/disassemblers.</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/9fe33194ec.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/9fe33194ec.jpg\" width=\"154\" height=\"31\"></a></p>\n<p>So we take a normal NOP instruction:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/f9e1ba7b2e.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/f9e1ba7b2e.jpg\" width=\"828\" height=\"26\"></a></p>\n<p>And put the REP prefix in front of it and get a pause:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/d25697daa5.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/d25697daa5.jpg\" width=\"849\" height=\"43\"></a></p>\n<p>It's really just a cool backwards compatibility hack on Intel's part. PAUSE was introduced with the Pentium 4. But if you used this instruction on an older Intel processor, what would it do? Well, as we discovered with a useless (non-string based) REP prefix, it wouldn't modify the instruction after it, in this case machine code for a NOP. 
In other words, machine code for a PAUSE on a pre-Pentium 4 processor would just be a glorified NOP.</p>\n",
				
				"date_published": "2016-02-13T09:00:00-04:00",
				"url": "https://xlogicx.micro.blog/2016/02/13/assembly-is-too-high-level.html"
			},
			{
				"id": "http://xlogicx.micro.blog/2016/01/24/080000.html",
				"title": "Assembly is Too High Level - Redundant Bit commands",
				"content_html": "<p>Compared to some of the most recent posts in this series, this one is a pretty basic example of a redundancy. This redundancy applies to the bit shifting instructions of: RCL, RCR, ROL, ROR, SAL, SAR, SHL, and SHR.</p>\n<p>These instructions can take an 8-bit immediate value, but there is also dedicated encoding for the operand to just be the value '1'. This is a very common operand for these type of instructions anyway, so it makes sense.</p>\n<p><a href=\"https://xlogicx.net/images/bits02.png\"><img src=\"https://xlogicx.micro.blog/uploads/2026/bd8e9745f3.jpg\" width=\"300\" height=\"172\"></a></p>\n<p>So the thing is, if we 'rcl eax, 1', there are two ways to encode this, but our assembler will only pick one of them. It sensibly picks the form where the encoding bakes in the implied '1' operand; the other form would require an extra byte of machine code (0x01) for the immediate value.</p>\n<p>Below is a screenshot of the collection:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/e4d1cbc15d.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/e4d1cbc15d.jpg\" width=\"147\" height=\"450\"></a></p>\n",
				
				"date_published": "2016-01-24T08:00:00-04:00",
				"url": "https://xlogicx.micro.blog/2016/01/24/080000.html"
			},
			{
				"id": "http://xlogicx.micro.blog/2016/01/24/assembly-is-too-high-level.html",
				"title": "Assembly is Too High Level - Why ESP doesn't scale - But EBP can still Base",
				"content_html": "<p>The main 8 general purpose registers are EAX, ECX, EDX, EBX, ESP, EBP, ESI, and EDI. In that order. You will see this structure in a lot of places. I will give some examples below, but it is in no way exhuastive; I just wanted to show some variety.</p>\n<p>There's the B0-B7 and B8-BF MOV instructions where the 2nd hex digit defines which register to receive an immediate value, notice that the registers are in the order described above.</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/9ba4769413.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/9ba4769413.jpg\" width=\"182\" height=\"227\"></a></p>\n<p>Registers are also encoded like this in the ModR/M byte. If we wanted to XOR EBX with ECX, we would use the 0xCB byte<br>\n(click for larger view)</p>\n<p><a href=\"https://xlogicx.net/images/modrmchart01.png\"><img src=\"https://xlogicx.micro.blog/uploads/2026/8b95e4c407.jpg\" width=\"300\" height=\"247\"></a></p>\n<p>It's the SIB encoding where we start running into some interesting exceptions to this rule. In the format of [Base + Index * Scale + Displacement]: It appears that the Index can be any of the 8 general purpose registers with exception to ESP (the one that would be missing from those), and the Base can be any general purpose register with exception to EBP.</p>\n<p>Based on these exceptions, we should expect 'xor eax, [esp * 2]' to fail. This is true, there is no way to encode this into machine code and an assembler will give an error. What is interesting is that we CAN do something like this: xor eax, [ebp + eax * 2]. In this case, we are specifying EBP as the base (not allowed?) 
EAX as the Index with a Scale of 2, and \"no displacement.\" Let's look at the machine code that NASM chose to go with to make this work:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/7755de0bfc.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/7755de0bfc.jpg\" width=\"292\" height=\"20\"></a></p>\n<p>Let's work backwards here. 0x00 is not the SIB (otherwise it would mean [eax + eax]); 0x45 is actually the SIB. You can refer to Chart 1, but it appears to be using the [*] item for the Base and EAX * 2 for the Index * Scale. The [*] completely depends on which one of the 4 Mods was used in the ModR/M byte. In the highlighted \"NOTES\" section of Chart 1, you'll see that [*] could either mean just a 32-bit displacement, or it could also mean EBP + an 8-bit or 32-bit displacement. Only Mod 01 and 10 allow us to use EBP.</p>\n<p>(Chart 1, click to enlarge)</p>\n<p><a href=\"https://xlogicx.net/images/SIBchart031.png\"><img src=\"https://xlogicx.micro.blog/uploads/2026/c00b2b9e2e.jpg\" width=\"300\" height=\"280\"></a></p>\n<p>This actually starts to explain why we had our 0x00 show up at the end of our machine code; it looks like we are dealing with an 8-bit displacement. Look to Chart 2 for where the ModR/M byte (0x44) falls on this. The [--][--] means we are using a SIB byte. We also see that an 8-bit displacement was selected.</p>\n<p>(Chart 2, click to enlarge)</p>\n<p><a href=\"https://xlogicx.net/images/SIBchart032.png\"><img src=\"https://xlogicx.micro.blog/uploads/2026/1488dbe282.jpg\" width=\"300\" height=\"247\"></a></p>\n<p>Knowing this, we could just as well have used the 32-bit displacement form (Mod 10), with machine code of:<br>\n33 84 45 00 00 00 00</p>\n<p>For debuggers and disassemblers that don't show the displacement when it is zero, both forms of this machine code look identical. 
For example, in Evan's Debugger (edb):</p>\n<p><a href=\"https://xlogicx.net/images/scales03.png\"><img src=\"https://xlogicx.micro.blog/uploads/2026/4222886fbf.jpg\" width=\"300\" height=\"26\"></a></p>\n",
				
				"date_published": "2016-01-24T08:00:00-04:00",
				"url": "https://xlogicx.micro.blog/2016/01/24/assembly-is-too-high-level.html"
			},
			{
				"id": "http://xlogicx.micro.blog/2016/01/19/assembly-is-too-high-level.html",
				"title": "Assembly is Too High Level - SIB Doubles",
				"content_html": "<p>I'm finding that there is a full playground in the ModR/M encoding, and this post is specifically about a SIB obscurity, only because of the way I see NASM assembling some of my assembly. Then I found other cool things NASM puts up with</p>\n<p>Consider this code:<br>\n<a><img src=\"https://xlogicx.micro.blog/uploads/2026/sibs01.png\" width=\"168\" height=\"38\" alt=\"\"></a></p>\n<p>Functionally, they both result in the same thing. There is even seperate machine-code to accurately represent both (kind of). But if we assemble it, we end up with this:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/4426d5e304.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/4426d5e304.jpg\" width=\"260\" height=\"31\"></a></p>\n<p>Even though we can hand code the machine code to get this:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/9439e27bd6.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/9439e27bd6.jpg\" width=\"328\" height=\"14\"></a></p>\n<p>The SIB byte allows us to do complex memory/pointer syntax of [<strong>Base + Index*Scale + Displacement</strong>], where the <strong>Base</strong> is a general purpose register, <strong>Index</strong> is another general purpose register multiplied by 1, 2, 4, or 8 (the <strong>Scale</strong>), and the <strong>Displacement</strong> is an immediate value. Becuase of the exceptions found in the encoding table for the SIB byte, any one of the <strong>Base/Index/Displacement</strong> are optional. You can choose not to have an <strong>Displacement</strong>, which is actually defined in the ModR/M, not the SIB. You can not have an <strong>Index </strong>(see chart 1), because \"none\" is an option that takes over the option of ESP. 
You can also omit the <strong>Base </strong>(see Chart 2), which is a special case that takes the place of EBP, although this comes with a side-effect of having an offset (if none was selected with ModR/M).</p>\n<p>(Chart 1, No Index, Click to enlarge):</p>\n<p><a href=\"https://xlogicx.net/images/SIBchart01.png\"><img src=\"https://xlogicx.micro.blog/uploads/2026/2c0c1d52ca.jpg\" width=\"300\" height=\"242\"></a></p>\n<p> </p>\n<p>(Chart 2, No Base, Click to enlarge):</p>\n<p><a href=\"https://xlogicx.net/images/SIBchart02.png\"><img src=\"https://xlogicx.micro.blog/uploads/2026/0de088b244.jpg\" width=\"300\" height=\"242\"></a></p>\n<p>That means that 'xor eax, [eax * 2]' isn't exactly a thing that exists. Because we are using the <strong>Index</strong> and no <strong>Base</strong>, we happen to be using that special case that requires an offset. So this is more accurately '<strong>xor eax, [eax * 2 + 0x00000000]</strong>'. It carries the same meaning, since we aren't arithmetically adding anything extra. But instead of interpreting '<strong>xor eax, [eax * 2]</strong>' like I just did, NASM does '<strong>xor eax, [eax + eax]</strong>'. This interpretation uses eax as the <strong>Base</strong>, and eax * 1 as the <strong>Index * Scale</strong>; this doesn't require an offset.</p>\n<p>Now, if I said 'xor eax, [eax + eax * 2]', it would dutifully use my 'eax * 2' Scale.<br>\nSo there are at least 2 ways to represent [eax * 2]-ish. NASM picks [eax + eax] because:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/95c1daf75a.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/95c1daf75a.jpg\" width=\"911\" height=\"58\"></a></p>\n<p>(from section 3.3 of the NASM manual)</p>\n<p>Is it better? It depends on how you look at it. It is better for smaller machine code, in the same way that 'xor eax, eax' has shorter machine code than 'mov eax, 0' (but effectively has the same result). 'xor eax, [eax * 2 + 0x00000000]' takes more machine code. 
But then I benchmarked the two versions...</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/275fcaf50d.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/275fcaf50d.jpg\" width=\"189\" height=\"478\"></a></p>\n<p>Testing on a 32-bit system, the xor eax, [eax * 2 + 0x00000000] form (<em>the one with MORE machine code</em>) tested <strong>consistently</strong> quicker (and I took a lot of samples). On my 64-bit system, the results were inconsistent; they both took about the same time but varied in which was the quickest.</p>\n<p>It's worth noting that you don't have to do any machine code hacking to use the faster (on some systems) version of this; just make sure to include the zero offset (xor eax, [eax * 2 + 0]), or you can even use a NASM feature to override this choice: 'xor eax, [nosplit eax * 2]'. And if you assemble my source files and wonder why they segfault, you have to hack the permissions of the .text section. It's actually easier than it sounds; just use a hex editor (I like vbindiff) and change the first 0x05 you see to a 0x07.</p>\n<p>As a final word on how awesome NASM is, you can write some super ignorant memory addresses, but so long as they are somehow equivalent to something valid, NASM will make an attempt. For example, a scale of 5 or even subtraction! Consider:<br>\n<a><img src=\"https://xlogicx.micro.blog/uploads/2026/sibs04.png\" width=\"196\" height=\"34\" alt=\"\"></a></p>\n<p>NASM assembles it to:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/6bf774437c.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/6bf774437c.jpg\" width=\"271\" height=\"30\"></a></p>\n",
				
				"date_published": "2016-01-19T09:00:00-04:00",
				"url": "https://xlogicx.micro.blog/2016/01/19/assembly-is-too-high-level.html"
			},
			{
				"id": "http://xlogicx.micro.blog/2016/01/11/assembly-is-too-high-level.html",
				"title": "Assembly is Too High Level - Load InEffective Address",
				"content_html": "<p><a href=\"https://xlogicx.micro.blog/uploads/2026/ae18b80dea.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/ae18b80dea.jpg\" width=\"852\" height=\"145\"></a></p>\n<p>The LEA (Load Effective Address) instruction allows us to copy the address of a memory location (in the memory addressing format you would find in ModR/M encoding) into a register. This instruction is also often used as a multiplication math hack used in place of MUL when LEA can be used instead. With the memory (pointer) encoding of the ModR/M byte (and the SIB) bytes, we are able to add 3 different numbers (two of them registers, one of them an immediate), and one of those numbers (one of the registers) can be multiplied by 2,4, or 8. For example, we could use LEA to compute (30 * 8) + 5 + 10. It would look like this:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/b59f0ea2dd.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/b59f0ea2dd.jpg\" width=\"605\" height=\"58\"></a></p>\n<p>The answer in hex is 0xff, which is 255 in decimal; the expected value.</p>\n<p>Those are some 'effective' ways to use LEA, but let's explore an impossible way to do it (there will be errors). Even though the Instruction format in the manual shows '<span style=\"color: #000000;\"><strong>LEA r32,m</strong></span>' (assuming the 32 bit version), it still shows the /r for the Opcode section. This implicitly means that we can encode for '<strong><span style=\"color: #000000;\">LEA r32, <span style=\"color: #ff0000;\">r32</span>/m</span></strong>'; both operands can be encoded as a register. 
For this LEA instruction to work how it's supposed to, the source operand (2nd one) needs to be a memory address (not a register).</p>\n<p>As you can probably guess from the theme of these \"Assembly is Too High Level\" posts, we probably can't do this in assembly...</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/77071eae3e.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/77071eae3e.jpg\" width=\"487\" height=\"136\"></a></p>\n<p>Not that it will be useful (unless you like having errors thrown in obscure ways), but we will encode the above instruction directly in machine code. Running this instruction will cause an error. The edb debugger shows the 2nd operand as invalid (astute observation) and then you get an illegal instruction fault (true).</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/c22e11cd80.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/c22e11cd80.jpg\" width=\"500\" height=\"181\"></a></p>\n<p> </p>\n<p>objdump shows a more interesting output: it splits the operand byte off and interprets it as its own instruction (rolb).</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/43c25afe1b.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/43c25afe1b.jpg\" width=\"467\" height=\"188\"></a></p>\n<p>Either way you cut it, as advertised, it's not very effective.</p>\n",
				
				"date_published": "2016-01-11T08:00:00-04:00",
				"url": "https://xlogicx.micro.blog/2016/01/11/assembly-is-too-high-level.html"
			},
			{
				"id": "http://xlogicx.micro.blog/2016/01/05/assembly-is-too-highlevel-test.html",
				"title": "Assembly is Too High-Level - TEST r32, r-m32, exists in assembly, but not the machine",
				"content_html": "<p>...And the TEST r32, r/m32 that exists in assembly is more just kind of a lie...</p>\n<p>An interesting thing about instructions that use the ModR/M encoding is that both the source and destination operands can be a register, but not both can be a memory location. When it comes to the registers, this has been the source of a lot of cool redundancies. This post is about a cool memory encoding redundancy though. As you'll note below, it usually takes two different machine encodings of the CMP instruction to use a 32-bit memory location for both the source or destination operands.</p>\n<p><a><img src=\"https://xlogicx.micro.blog/uploads/2026/testmod01.png\" width=\"600\" height=\"18\" alt=\"\"></a> \n<a><img src=\"https://xlogicx.micro.blog/uploads/2026/testmod02.png\" width=\"600\" height=\"18\" alt=\"\"></a></p>\n<p> </p>\n<p>This is what the instructions would look like in assembly:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/8f89ff26bf.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/8f89ff26bf.jpg\" width=\"114\" height=\"35\"></a></p>\n<p>This is me looking at them in a debugger:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/7c0bb8a4da.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/7c0bb8a4da.jpg\" width=\"214\" height=\"29\"></a></p>\n<p>You'll see both 0x39 and 0x3b encodings in the debugger screenshot (although out of order, that's just because that's the order I wrote them in the assembly source file).</p>\n<p>Everything looks consistent so far. But now let's look at test...</p>\n<p>Assembly:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/1f5b45ff44.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/1f5b45ff44.jpg\" width=\"125\" height=\"34\"></a></p>\n<p>From Debugger:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/cffb366b41.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/cffb366b41.jpg\" width=\"221\" height=\"28\"></a></p>\n<p> </p>\n<p>What? 
It's literally the same machine code for both instructions. Why is this? This actually isn't something stupid, but something fairly smart that Intel did. First I'll note that the Intel manual has an entry for this form of TEST:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/e4604283ec.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/e4604283ec.jpg\" width=\"829\" height=\"41\"></a></p>\n<p> </p>\n<p>However, there is no corresponding entry for a <strong>'r32, r/m32'</strong> form. But really, is there any reason we need that other form? The result of <strong>ANDing 0x55 with 0xAA</strong> is the same as <strong>ANDing 0xAA with 0x55</strong> (the result is 0x00 for both). This is not the case with CMP, where subtraction is implicitly done, and the order of operands would change the result. There are only so many one-byte opcodes to go around, and Intel saves one of them by not defining an opcode for the completely redundant operand order.</p>\n",
				
				"date_published": "2016-01-05T09:00:00-04:00",
				"url": "https://xlogicx.micro.blog/2016/01/05/assembly-is-too-highlevel-test.html"
			},
			{
				"id": "http://xlogicx.micro.blog/2015/12/30/assembly-is-too-highlevel-bswapin.html",
				"title": "Assembly is Too High-Level - BSWAPin 16-bit Registers",
				"content_html": "<p><a href=\"https://xlogicx.net/images/bswap.png\"><img src=\"https://xlogicx.micro.blog/uploads/2026/f0d0512026.jpg\" width=\"734\" height=\"1024\"></a></p>\n<p> </p>\n<h3></h3>\n<h3>But what actually happens?</h3>\n<p>As it turns out, the Intel manual is correct in stating that you should use xchg instead of bswap. In practice, it's hard to say the result of this 16-bit bswap is 'undefined;' as it is consistent with what it does each time. Instead of swapping the contents of ah and al 8-bit registers within ax, it actually just clears the register to 0x00. I tried a lot of different values to test this, and it always zero's it out. I also tried fully loading up eax, and bswap'ing any value for ax just cleared the ax part, leaving the upper part of eax intact.</p>\n<p>So in practice: bswap reg16 = xor reg16, reg16 = mov reg16, 0 (were both reg16's are the same register in the xor)</p>\n",
				
				"date_published": "2015-12-30T08:00:00-04:00",
				"url": "https://xlogicx.micro.blog/2015/12/30/assembly-is-too-highlevel-bswapin.html"
			},
			{
				"id": "http://xlogicx.micro.blog/2015/12/19/assembly-is-too-highlevel-self.html",
				"title": "Assembly is Too High-Level - Self Modifying Code with Basic Arithmetic",
				"content_html": "<p>I should say that we are able to do this trick all in assembly, but none of it would make sense without an understanding of machine code.</p>\n<p>This post is about simple self-modifying code tricks you can do with addition and subtraction to an instruction to make it another instruction, while also maintaining consistent addressing modes and operands (I.E. adding 8 to the 2nd byte of machine code of 'add bl, 5' would become 'or bl, 5').</p>\n<p>This trick is easy for many instructions that share the same first opcode byte. Below is a simple example showing INC and DEC:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/66dde23639.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/66dde23639.jpg\" width=\"567\" height=\"19\"></a></p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/85811bf88c.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/85811bf88c.jpg\" width=\"557\" height=\"19\"></a></p>\n<p>The /0 and /1 is referring to bits 3-5 of the byte following FE. So if these bits happen to be 000, then it's INC. If the bits are 001, then it's DEC. There's a more detailed breakdown of how this all works in my previous post on redundancies in the \"<a href=\"http://xlogicx.net/?p=249\">OP REG, imm</a>\" instruction format. 
Here are the 2 similar instructions in edb:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/d578de19d0.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/d578de19d0.jpg\" width=\"96\" height=\"30\"></a></p>\n<p>Below is the binary representation of those instructions converted from the hex to binary, with the 'instruction' 3 bits highlighted for clarity:</p>\n<p>11111110 11<span style=\"color: #ff0000;\"><em><strong>000</strong></em></span>000<br>\n11111110 11<span style=\"color: #ff0000;\"><em><strong>001</strong></em></span>000</p>\n<p>Let's take a look at another INC/DEC example:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/51416a790e.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/51416a790e.jpg\" width=\"325\" height=\"31\"></a></p>\n<p>Even though one instruction is an INC, and the other is a DEC, everything else about the assembly is the same (both to a memory pointer using rcx + the same offset). The only difference is in the 2nd byte of the machine code. You could literally add 8 to that 2nd byte, and turn an INC into a DEC with everything else being the same.</p>\n<p>All of these shared-byte instructions are like this: each one is +/- 8 away from the next one. So adding 8 to the 2nd byte of a specific type of ADD will render an OR. Add 8 more and you get an ADC, then SBB, then AND, then SUB, then XOR, and finally CMP. 
This also means that adding say 32 to the 2nd byte of the SBB would result in a CMP.</p>\n<p>Here is a video proof of concept of abusing this idea to xor 0x55 with 0xaa while heavily obfuscating that we are actually xoring:</p>\n<p><a href=\"https://www.youtube.com/watch?v=iETt65Cnobc\">www.youtube.com/watch</a></p>\n<p>If you aren't in the mood for video, here is a before and after run of the 4 instructions:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/85e902100e.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/85e902100e.jpg\" width=\"716\" height=\"72\"></a></p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/dd75c56f00.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/dd75c56f00.jpg\" width=\"714\" height=\"72\"></a></p>\n<p>And here is the assembly source:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/616ce45ca9.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/616ce45ca9.jpg\" width=\"251\" height=\"138\"></a></p>\n<p>Appendix:</p>\n<p>If you would like to know all of the instructions that you can jump around in with simple maths modifications, here they are:</p>\n<p>0x80 - 1-byte operand: ADD, OR, ADC, SBB, AND, SUB, XOR, CMP<br>\n0x81 - 4-byte operand: ADD, OR, ADC, SBB, AND, SUB, XOR, CMP<br>\n0x82 - 1-byte operand: ADD, OR, ADC, SBB, AND, SUB, XOR, CMP<br>\n0x83 - 1-byte operand: ADD, OR, ADC, SBB, AND, SUB, XOR, CMP<br>\n0xC0 - 1-byte operand: ROL, ROR, RCL, RCR, SHL, SHR, SAL, SAR<br>\n0xC1 - 1-byte operand: ROL, ROR, RCL, RCR, SHL, SHR, SAL, SAR<br>\n0xD0 - 0-byte operand: ROL, ROR, RCL, RCR, SHL, SHR, SAL, SAR<br>\n0xD1 - 0-byte operand: ROL, ROR, RCL, RCR, SHL, SHR, SAL, SAR<br>\n0xD2 - 0-byte operand: ROL, ROR, RCL, RCR, SHL, SHR, SAL, SAR<br>\n0xD3 - 0-byte operand: ROL, ROR, RCL, RCR, SHL, SHR, SAL, SAR<br>\n0xF6 - 0-byte operand: *TEST, *TEST, NOT, NEG, MUL, IMUL, DIV, IDIV<br>\n0xF7 - 0-byte operand: **TEST, **TEST, NOT, NEG, MUL, IMUL, DIV, IDIV<br>\n0xFE - 0-byte operand: INC, DEC<br>\n0xFF 
- 0-byte operand: INC, DEC</p>\n<p>*operand is actually 1-byte<br>\n**operand is actually 4-bytes</p>\n<p>Within each individual list, the instructions sit a value of 8 apart from each other in the 2nd byte of the resulting machine code. So if I wanted to convert a CMP to an ADC, that's 5 instructions back, times 8. So subtracting 40 from the 2nd byte of one of those CMP instructions would effectively make it an ADC with everything else intact. The only exception is that the TEST instructions have different operand sizes than the rest in the list (they are not convertible).</p>\n<p>Also, it may go without saying, but I'm going to say it anyway: this is self-modifying code, therefore, wherever this code is running from, it needs to be rwx. Default from assemblers is r-x. Personally, I manually changed the 05 (r-x) byte in the p_flags field of the ELF program header so that it reads 07 (rwx).</p>\n",
				
				"date_published": "2015-12-19T08:00:00-04:00",
				"url": "https://xlogicx.micro.blog/2015/12/19/assembly-is-too-highlevel-self.html"
			},
			{
				"id": "http://xlogicx.micro.blog/2015/11/25/followup-on-creating-vmwdqy.html",
				"title": "Follow-up on creating Vm0wd2Qy - 9000",
				"content_html": "<p><a href=\"https://xlogicx.micro.blog/uploads/2026/fa304e7293.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/fa304e7293.jpg\" width=\"550\" height=\"81\"></a></p>\n<p>This is referring to my previous post on Vm0wd2Qy, and a clarification of how I got my results. If you repeatedly Base64 encode a string, you will eventually get Vm0wd2Qy as the first part of your string. In my previous post, I have 10,000 characters that you would eventually get as the first part of your string if you keep doing this. My process for getting these 10,000 characters involved a kind of brute force, but with the obvious assistance of scripting and cli stuff.</p>\n<h3>Experimenting:</h3>\n<p>Let's just get the Vm0 part of Vm0wd2Qy real quick, from a starting string of 'lol'. We do this by encoding 'lol', then encoding the result of that, and repeating:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/a43a5439e1.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/a43a5439e1.jpg\" width=\"966\" height=\"311\"></a></p>\n<p>In our last result, we do see 'Vm0'; we see our magic string starting to form. But as far as brute forcing this goes, I hope you see the problem: how do we know where the magic string ends, and where all of this extra baggage starts? Let's also consider another example of making 8 base64 passes with a different starting 'seed' than 'lol'.</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/25fdf9f6d9.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/25fdf9f6d9.jpg\" width=\"967\" height=\"290\"></a></p>\n<p>If you just look at the beginning and end of the final strings from each seed, it looks like they end up being exactly the same. 
But a closer look shows that this is not so:</p>\n<p>Vm0xd1IyRXd<span style=\"color: #ff0000;\"><strong>O</strong></span>VWhT<strong><span style=\"color: #ff0000;\">V0doWVYwZDRWVll</span></strong>3WkZOV2JHeHpWV3hrVm1KR2NIbFdWM1JQVmxaS1ZrNVlWbFpTYkVZelZUTmtkbEJSYnowSwo=</p>\n<p>Vm0xd1IyRXd<strong><span style=\"color: #ff0000;\">N</span></strong>VWhT<strong><span style=\"color: #ff0000;\">YmtwUFZtMW9WbFl</span></strong>3WkZOV2JHeHpWV3hrVm1KR2NIbFdWM1JQVmxaS1ZrNVlWbFpTYkVZelZUTmtkbE4zYnowSwo=</p>\n<p>It looks like they both start with \"Vm0xd1IyRXd\", but that doesn't mean \"Vm0xd1IyRXd\" is the magic string. If only it were that easy.</p>\n<h3>Scripting:</h3>\n<p>Instead of working smart though, I worked brute. The scripted idea was to run the string through many passes, with all possible 1-character seeds (64 of them), only output the first 'x' characters, and hope for 'uniq -c' on the result to only return one value. I used all seeds to hopefully avoid the above problem of seeds agreeing on a prefix that isn't actually the magic string (no promises that this guarantees the right result, however, just a best effort).</p>\n<p>Here's an example of the script I was working with:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/8b1d06810b.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/8b1d06810b.jpg\" width=\"649\" height=\"357\"></a></p>\n<p>When run, here's a screenshot of the tail end of the output:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/31c335c4cd.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/31c335c4cd.jpg\" width=\"614\" height=\"513\"></a></p>\n<p>The uniq command would indeed give me one entry for each line (not what I want). Here's the output I would get from uniq if I changed sprintf(\"%.10000s\"... to sprintf(\"%.3s\"... 
(this line is the 'x' characters cutoff piece).</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/835cffd49b.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/835cffd49b.jpg\" width=\"422\" height=\"35\"></a></p>\n<p>This is the kind of output that I'm looking for.</p>\n<p>If I want to get a large (over 9000) string, I need to set the sprintf value to truncate at the goal value of %.9000s (I went for 10000). If uniq doesn't come back with just 1 result, then I need to ramp up my $i variable (this variable defines how many base64 passes to make for each starting seed). On the flip side, if I preferred not to make any more base64 passes and just wanted to know how many leading characters all 64 strings had in common, I could bring the sprintf number down until uniq -c gave just one result.</p>\n<p>There is a limit to how high you can set $i though; your shell would eventually throw a fit. Actually, without 'use warnings;', you would just get the wrong result. Otherwise, you would get something like \"Can't exec \"/bin/sh\": Argument list too long at base64.pl line...\" with a value of $i being too high. Because of this limit, I wasn't able to get much higher than 10,000 shared characters in my string without having to do something like recompile the kernel. I forget how high I made $i, but it was one number less than what would give me an error.</p>\n",
				
				"date_published": "2015-11-25T09:00:00-04:00",
				"url": "https://xlogicx.micro.blog/2015/11/25/followup-on-creating-vmwdqy.html"
			},
			{
				"id": "http://xlogicx.micro.blog/2015/11/08/assembly-is-too-high-level.html",
				"title": "Assembly is Too High Level - Jump Near When Short",
				"content_html": "<p>This article is about a redundancy with short and near jumps. Both of the jumps I will talk about are relative; the immediate data after the jump instruction is a signed offset for how far to jump. The difference between a short and near jump is simple: the 0xeb short jump has a byte for its operand, and the 0xe9 near jump has 4 bytes for its operand. This means that we can jump -128 through 127 bytes with a short jump and -2,147,483,648 through 2,147,483,647 bytes with a near jump.</p>\n<h3><strong>Dynamic Assembler Behavior:</strong></h3>\n<p>First of all, the assembler I use (nasm) will try and guess which type of jump to use if you don't specify short or near. So if I'm jumping farther than a short jump can reach, it will use a near jump for me; otherwise, it will use a short jump. At a low level, I find this interesting; that the same assembly instruction can have different machine code depending on how far away (in bytes) the label operand is. Of course, different machine code for consistent assembly isn't weird in general; an addressing mode can quickly change which machine code to use. I still find this behavior for the jump instruction interesting though.</p>\n<h3><strong>The Redundancy:</strong></h3>\n<p>Say we were only jumping forward 127 bytes, but for some reason we wanted to use the near jump (0xe9) format. By default, nasm would use a short (0xeb) jump. The short jump range is still a valid range in near jumps (speaking for machine-code encoding). Fortunately, we don't need to do machine code hacks in order to override this; nasm allows us to use 'short' and 'near' arguments to the jmp instruction (to override the default).</p>\n<h3><strong>PoC:</strong></h3>\n<p>Here is a source file. 
The jmp jumps to a label 128 bytes forward, which will require a near jump.</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/b180b700dc.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/b180b700dc.jpg\" width=\"213\" height=\"459\"></a></p>\n<p>Here is a debugger screenshot of the machine-code used for the 'jmp.'</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/90402abbd4.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/90402abbd4.jpg\" width=\"387\" height=\"18\"></a></p>\n<p>Now let's move our 'end:' label back by 1 byte (to jump 127 bytes, which a short jump can do):</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/0bd0a74b12.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/0bd0a74b12.jpg\" width=\"213\" height=\"461\"></a></p>\n<p>Here's the debugger output:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/f728be517e.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/f728be517e.jpg\" width=\"330\" height=\"19\"></a></p>\n<p>So let's override this short jump (we still only want to jump 127 bytes, but with the long 4-byte operand encoding of a near jump):</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/e24d8559b7.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/e24d8559b7.jpg\" width=\"214\" height=\"462\"></a></p>\n<p>And our result (in debugger):</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/574b3c2788.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/574b3c2788.jpg\" width=\"387\" height=\"19\"></a></p>\n",
				
				"date_published": "2015-11-08T08:00:00-04:00",
				"url": "https://xlogicx.micro.blog/2015/11/08/assembly-is-too-high-level.html"
			},
			{
				"id": "http://xlogicx.micro.blog/2015/11/06/vmwdqy.html",
				"title": "Vm0wd2Qy",
				"content_html": "<p>This article will be about 2 encodings/interpretations of binary: ASCII and Base64. ASCII is an 8-bit (debatable I know, but go with it) character encoding, and Base64 is a 6-bit character encoding. Conversions can be made between them:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/4ded91c0c5.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/4ded91c0c5.jpg\" width=\"640\" height=\"37\"></a></p>\n<p>Not too interesting. This post is more about a string like \"Vm0wd2Qy\":</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/c4314d2b89.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/c4314d2b89.jpg\" width=\"595\" height=\"37\"></a></p>\n<p><span style=\"color: #808080;\">[2nd command is just there to add a graceful newline]</span> The decoded version is smaller, but that's it; this transformation was not as dramatic as the first example. How can this be?</p>\n<p>Consider that it is possible for the same data to 'mean' the same thing in two different encodings at the same time.</p>\n<p>Let's look at the binary encodings for ASCII and Base64. We won't do any hexadecimal or decimal conversions or any unnecessary stuff like that; we will just look at what binary produces which printable characters for both systems. I am only going to show the binary encodings of A-Za-z0-9+/ for ASCII, and for Base64 (which happens to be the full base64 character set). To be clear, there are more characters (printable and not) in the ASCII character set. For a full reference on both ASCII and Base64, the Wikipedia pages are more than enough.</p>\n<p>This table represents both systems, with the printable character in the middle, and the binary that represents the character for each base. 
It's not a fancy chart, but it gets the job done:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/36d2a548d6.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/36d2a548d6.jpg\" width=\"176\" height=\"1104\"></a></p>\n<h3><strong>Some Toy Conversions:</strong></h3>\n<p>Converting is fairly straightforward, although you have to be mindful that one system is 8-bit and the other is 6-bit. Let's convert the first 4 base64 characters in the example that this post starts with.</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/242bfdea55.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/242bfdea55.jpg\" width=\"606\" height=\"194\"></a></p>\n<p>The top (red) is the Base64, and the bottom (blue) is the ASCII. It is the same binary for each. You can look at the conversion table above to see that this binary directly corresponds to the printable characters they are supposed to.</p>\n<h3><strong>Now, Vm0wd2Qy:</strong></h3>\n<p><a href=\"https://xlogicx.net/images/conversions3.png\"><img src=\"https://xlogicx.micro.blog/uploads/2026/65302f64d0.jpg\" width=\"1024\" height=\"171\"></a></p>\n<p><strong><span style=\"color: #ff0000;\">This kind of black magic excites me :).</span></strong></p>\n<p>If you decode the (red) Base64, you get the same data, just fewer characters, because of the 4:3 ratio. If you take the ASCII results (still also valid Base64 characters), and treat it as Base64 and decode it, the same pattern emerges (in this case, Vm0wd2 would decode to Vm0w). Eventually, we are reduced to nothing.</p>\n<h3><strong>Over 9000 digits of this:</strong></h3>\n<p>So Vm0wd2Qy is cool, but let's look at an epic 'recursive' Base64 string of 10,000 characters. The same goes for this: you can keep doing 'base64 -d', and the string doesn't change, it just shrinks. Even in 10,000 characters, there are no infinitely repeating patterns. That said, the entropy is also very low. 
Out of all these characters, there are 975 'V's (10%), 556 'W's, 1 'g' and no 'f's, '7's, '+'s or '/'s. There are so many other interesting patterns that emerge when playing with even the simplest of formal systems, but I feel I've already awarded myself with an informal autism diagnosis for this post, so without further delay, over 9,000 of these beautiful characters:</p>\n<p><span style=\"color: #ff6600;\">Vm0wd2QyUXlVWGxWV0d4V1YwZDRWMVl3WkRSV01WbDNXa1JTVjAxV2JETlhhMUpUVmpBeFYySkVUbGhoTVVwVVZtcEJlRll5U2tWVWJHaG9UVlZ3VlZadGNFSmxSbGw1VTJ0V1ZXSkhhRzlVVmxaM1ZsWmFkR05GU214U2JHdzFWVEowVjFaWFNraGhSemxWVm14YU0xWnNXbUZrUjA1R1UyMTRVMkpIZHpGV1ZFb3dWakZhV0ZOcmFHaFNlbXhXVm0xNFlVMHhXbk5YYlVaclVqQTFSMVV5TVRSVk1rcElaSHBHVjFaRmIzZFdha1poVjBaT2NtRkhhRk5sYlhoWFZtMHhORmxWTUhoWGJrNVlZbFZhY2xWcVFURlNNVlY1VFZSU1ZrMXJjRWxhU0hCSFZqRmFSbUl6WkZkaGExcG9WakJhVDJOdFJraGhSazVzWWxob1dGWnRNSGhPUm14V1RVaG9XR0pyTlZsWmJGWmhZMnhXY1ZGVVJsTk5WbFkxVkZaU1UxWnJNWEpqUld4aFUwaENTRlpxUm1GU2JVbDZXa1prYUdFeGNHOVdha0poVkRKT2RGSnJhR2hTYXpWeldXeG9iMWRHV25STlNHaFBVbTE0VjFSVmFHOVhSMHB5VGxac1dtSkdXbWhaTW5oWFkxWkdWVkpzVGs1V2JGa3hWa1phVTFVeFduSk5XRXBxVWxkNGFGVXdhRU5UUmxweFVtMUdVMkpWYkRaWGExcHJZVWRGZUdOSE9WZGhhMHBvVmtSS1QyUkdTbkpoUjJoVFlYcFdlbGRYZUc5aU1XUkhWMjVTVGxOSGFGQlZiVEUwVmpGU1ZtRkhPVmhTTUhCNVZHeGFjMWR0U2tkWGJXaGFUVzVvV0ZreFdrZFdWa3B6VkdzMVYySkdhM2hXYTFwaFZURlZlRmR1U2s1WFJYQnhWV3hrTkdGR1ZYZGhSVTVVVW14d2VGVnRNVWRWTWtwV1lrUmFXR0V4Y0hKWlZXUkdaVWRPU0U5V1pHaGhNSEJ2Vm10U1MxUXlVa2RUYmtwb1VqSm9WRmxZY0ZkbGJHUllaVWM1YVUxWFVraFdNalZUVkd4T1NHRkdRbFppVkVVd1ZtcEdVMVp0UmtoUFZtaFRUVWhDTlZaSGVHRmpNV1IwVTJ0a1dHSlhhR0ZVVnpWdlYwWnJlRmRyWkZkV2EzQjZWa2R6TVZZeVNrZGhNMmhYWVRGd2FGWlVSbFpsUm1SMVUyczFXRkpZUW5oV1YzaHJUa2RHUjFaWVpHaFNWVFZWVlcxNGQyVkdWblJOVldSV1RXdHdWMWxyVW1GWFIwVjRZMGhLV2xaWFVrZGFWV1JQVTBVNVYxcEhhR2hOU0VKMlZtMTBVMU14VVhsVmEyUlVZbXR3YjFWcVNtOVdSbXhaWTBaa2JHSkhVbGxhVldNMVlWVXhXRlZyYUZkTmFsWlVWa2Q0VDFOSFJrZFJiRnBwVmtWVmQxWnRjRWRWTVZwMFVtdG9VRlp0YUZSVVZXaERUbFphU0dWSFJtcE5WMUl3VlRKMGExZEhTbGhoUjBaVlZucFdkbFl3V25KbFJtUnlXa1prVjJF
elFqWldhMlI2VFZaWmVWTnJaR2hOTW1oWVdWUkdkMkZHV2xWU2JGcHNVbTFTTVZVeWN6RlhSa3BaVVc1b1YxWXphSEpVYTJSSFVqRmFXVnBIYUZOV1ZGWldWbGN4TkdReVZrZFdibEpPVmxkU1YxUlhkSGRXTVd4eVZXMUdXRkl3VmpSWk1HaExWMnhhV0ZWclpHRldWMUpRVlRCVk5WWXhjRWhoUjJoT1UwVktNbFp0TVRCVk1VMTRWVmhzVm1FeVVsVlpiWFIzWWpGV2NWTnRPVmRTYlhoYVdUQmFhMkpIU2toVmJHeGhWbGROTVZsV1ZYaFhSbFp5WVVaa1RtRnNXbFZXYTJRMFZERk9TRkpyWkZKaVJuQndWbXRXVm1ReFduUmpSV1JXVFZad01GVnRkRzlWUmxwMFlVWlNWVlpYYUVSVWJGcGhVMGRXU0ZKdGNFNVdNVWwzVmxSS01HRXhaRWhUYkdob1VqQmFWbFp1Y0Zka2JGbDNWMjVLYkZKdFVubFhhMXByVmpKRmVsRnFXbGRoTWxJMlZGWmFXbVZXVG5KYVIyaE9UVzFvV1ZkV1VrZGtNa1pIVjJ4V1UySkdjSE5WYlRGVFRWWlZlV042UmxoU2EzQmFWVmMxYjFZeFdYcGhTRXBWWVRKU1NGVnFSbUZYVm5CSVlVWk9WMVpHV2xkV2JHTjRUa2RSZVZaclpGZGliRXBQVm14a1UxWXhVbGhrU0dSWFRWZDRlVlpYTVVkWFJrbDNWbXBTV2sxSGFFeFdNbmhoVjBaV2NscEhSbGRXTVVwUlZsUkNWazVXV1hoalJXaG9VakpvVDFVd1ZrdE5iRnAwVFZSQ1ZrMVZNVFJXVm1oelZtMUZlVlZzVmxwaVdGSXpXV3BHVjJOV1RuUlBWbVJUWWxob1lWZFVRbUZoTWtwSVUydG9WbUpIZUdoV2JHUk9UVlpzVjFaWWFGaFNiRnA1V1ZWYWExUnRSbk5YYkZaWFlUSlJNRlpFUms5VFJrcHlXa1pLYVZKdVFuZFdiWFJYVm0xUmVGZHVVbXBTVjFKWFZGWmFkMDFHVm5Sa1J6bFdVbXh3TUZsVldsTldWbHBZWVVWU1ZXSkdjR2hWTUdSWFUwWktkR05GTlZkTlZXd3pWbXhTUzAxSFJYaGFSV2hVWWtkb2IxVnFRbUZXYkZwMVkwWmthMkpHYkROV01qVkxZa1pLZEZWdWJGaGhNWEJ5Vm1wS1JtVnNSbkZYYkdSb1RXeEpNbFpHV21GWGJWWlhWRzVLWVZJeWFFOVVWekZ2VjFaa1YxVnJaR3ROYTFwSVZqSjRWMVV5U2tkalNFNVdZbFJHVkZSV1dsWmxWMDQyVW14b1UyRXpRbUZXVm1NeFlqRlplRmRZY0doVFJYQldXVlJLVTFOR1ZuRlNiVVpZVm01Q1NWbFZXazlXTVZwSFYyeGtWMkpIVGpSVWEyUlNaVlphY2xwR1pHbGlSWEJRVm0xNGExVXhXWGhWYkdoclUwZFNXRlJXWkRSbFZscFlUVlZrV0ZKcmJETldiWEJUVjJzeFNHRkZlRmROYm1ob1ZqQmFWMk5zY0VoU2JHUlhUVlZ3VWxac1VrTldhelZYVjFob2FsSlhhRzlWYWtwdlZERlZkMVpyZEU1aVJuQXdWRlpTUTFack1WWk5WRkpYVm0xb2VsWnRNVVpsVmxaelZteHdhVmRHU1hwWFYzQkhWakpPVjFSdVVsQldiVkpVV1d4b2IxbFdaRlZSYlVab1RXdHdTVlV5ZEc5V2JVcElaVWRvVjJKSFVrOVVWbHB6VmpGYVdXRkdhRk5pUm5BMVYxWldZV0V4VW5SU2JrNVlZa1phV0ZsVVNsSk5SbHBGVW1zNVZGSnJjSGxYYTFwTFlWWktkVkZ1WkZkaVdGSllWbTB4VW1WR1pIVlZiWEJUVmpGS1dGWkdXbUZrTURGSFZtNVNhMUo2YkZkVmJYaDNUVVpzVmxkc1RsZFdiSEJaV1ZW
V1UxWlhTa2RqUjJoV1RVZFNXRlV3V2t0a1IwNUdUbFprVGxaWGQzcFdiWGhUVXpBeFNGSllhR0ZTVjJoVldXdGtiMkl4Vm5GUmJVWlhZa1p3TVZrd1dtdGhNa3BIWWtST1YwMXFWa3haYTFwTFpFWldkV0pHYUdoTldFSjVWbTF3UzFKdFZuTlNia1pZWWtkU2IxUlhlRXBOYkZwSFYyMUdXR0pXV2xoV1J6VkxXVlpKZVdGRk9WVldla1oyVmpGYWExWXhWbkphUjNST1lURndTVlpxU2pSV01WVjVVMnRrYWxORk5WZFpiRkpIVmtaU1YxZHNXbXhXTURReVZXMTRiMVV5UlhwUmJVWlhWbTFOZUZscVJscGxSbVJaWTBkb1ZGSllRbGRYVmxKTFZURk9SMVp1UmxOaVZWcFpWbTAxUTFOV2JGWlhhemxYVFZad1NGWXllR3RXTWtwSVZHcFNWV0V5VWxOYVZscGhZMnh3UjFwSGJHbFNXRUpSVm0weE5HRXhWWGhYV0doV1lrZG9jbFV3WkZOWFJsSlhXa1JTYkZac2NGbFVWbFpyVjBaS2NtSkVUbGRpV0VKVVZqSnplRk5IUmtabFJtUk9ZbTFvYjFacVFtRldNazV6WTBWb1UySkhVbGhVVmxaM1ZXeGFjMVZyVGxkaGVsWllWakZvYjJGc1NsaGhSemxXWVd0d2RsWkVSbFprTVZweVpFVTFhVkp1UVhkV1JscFRVVEZhY2sxV1drNVdSa3BZVm0weGIyVnNXblJOVlZwc1ZteGFlbFp0ZUhkaFZtUkhVMWh3V0ZZelFraFdha3BQVmpGU2RWVnNRbGRpVmtwVlYxZDRiMkl4YkZkYVJsWlNZbFZhYjFSWGRIZFRWbFY1WkVjNVYySlZjRWxhVldSdlZqSktTRlZyT1ZWV2JIQjZWbXBHWVZkWFJrZGhSazVwVW01Qk1WWXhXbGRaVjBWNFZXNVNVMkpyTlZsWmExcGhWMFpzVlZOc1NrNVNiSEJHVlcxek5WWkdTbkpqU0d4WVlURndURmxXV2t0a1ZrWjFXa1prYVZkR1NsbFdiVEI0VlcxV1IxcElWbFppV0VKVVZtcEdTMWRHV2tkWGJVWnJUVVJHU0ZadE5WTmhNVW8yWWtaa1ZtSllhRE5VVlZwelZteGtjMVJzWkZkaVNFSmFWMVpXVjFVeFdsaFRhMlJxVWpKb1YxUlZaRk5YUmxWM1YydDBhMUl3TlVkVWJGcHJWR3hhV0dRemNGZGlXR2hVVlhwQmVGTkdTbGxoUjBaVFZqSm9WbGRYZEd0aU1rbDRWbTVHVW1KVldsaFphMXAzVFZacmQxZHRkR2hOYTNCSVdXdFNUMVl3TVhGV2EzaGFZVEZ3VEZwRldsZGtWMHBIWVVkb1RsZEZTalZXYlRGM1V6RktkRlp1VGxOaWExcFpXV3RrVTJJeFVsaGxSWEJPWWtad1NGWXlNVWRYUjBwWFYyeG9WMUo2Vmt4WFZscEtaVmRHUms5V1ZsZGxhMW95Vm1wR1lXRXhXWGhXYmtwaFVqTlNUMWxVUm5kVFZscHhVMnBTVjAxV1ZqVlZNblJyWVd4T1JrNVdaRnBpUmtwSVZtdGFVMVl4WkhOWGJYaFhUVVJSZVZaWE1UUmlNVlY1VWxod1VtSlZXbGhXYlRGT1pVWnNjVkpzY0d4U2JWSmFXVEJrYjFaR1NsbFJiR1JZVm14S1RGWlVSazlTTVZwMVVteE9hVlpXY0ZwV2JUQXhVVEZPVjJKR1dsaGhlbXhZVkZaYWQxTkdXWGxsUjNSWFRXdHdTVlpIY0ZOV1YwVjVWV3hPWVZac2NHaFpNbmgzVWpGd1NHSkhiRk5YUlVwU1ZtMHdlRTVIUlhoV1dHaFlWMGRvVjFsclpHOWpiRlYzV2taT1dGSnNTbGhYYTFKVFlrWmFjMk5HYkZWV2JGcHlWakJhUzJOdFNrVlViR1JwVjBWS1ZWWnFTbnBsUms1
SVVtdGFiRkp0VWs5WmJURnZZakZhY1ZGdFJsZE5helY2V1RCV2IxVXlTa2hWYkdoVlZteGFNMVpYZUdGak1WWnlWR3hrYUdWc1dsbFdha1p2WWpKR2MxTnNhR2hTZW14WFdWZDBkMlJzV2tWU2JGcHJUVlp3ZVZwRlZURmhWa3AxVVZoa1YxSnNjRlJWVkVaaFkyc3hWMWRyTlZkU2EzQlpWbTB3ZUdJeVZuTlhibEpPVmxad2MxWnRlR0ZsYkZwMFpVaGthRlp0VWtoVk1XaDNWMFpaZWxGcmFGZGhhM0JRVm1wR1UyUldWbk5SYkdScFZtdHdWbFl4WkRSaU1rbDRZa1prWVZKc1dsTlpiRlpoWWpGU1dHUklaRlJXYkZZMVdrVmpOVll5U2xaalJXeGhWbGRTZGxacVNrdFRSbFp5VDFaV1YySklRalpXYlhCSFdWWmtXRkpyWkdoU2F6VndWVzAxUWsxV1dYaFhiR1JhVmpCV05GWlhOVk5WTWtweVRsWnNXbUV4Y0doV01GcHpWbFpPYzFOck5WTmlTRUpIVmxjeGVrNVdXa2RUYmtwcVUwZDRXVmxVUm5kVVJteFhWMnQwYTFacldqRlhhMXByWVZaa1IxTnNiRmRpVkVJMFZsY3hWMUl4Y0VsV2JGcHBWMFZLVlZadGNFZFRNV1JIVjJ0a1dHSlZXbkpVVlZKSFUwWmFkRTVXVG1oTlZXOHlXV3RhYTFadFNsbGhSRTVWVmxad2VsWnRlR3RqTVZKeldrWmthVk5GU21GV01WcFhWakZWZUZkc2FGUmhNbEp4VldwS2IxWXhVbFphUms1WVVteFdNMVl5ZERCaE1VbDNUbFZrVldKR2NISldSM2hoVjBkUmVtTkdaR2xYUjJoVlZsaHdRbVZHVGtkVGJHeG9VakJhVkZacVNtOVdiR1JZWkVkMGFVMXJiRFJXYlRWSFZXMUtWbGRzYUZwaE1YQXpWRlphY21ReFpIUmtSMmhPWVROQ1NWZFhkRk5VTVZsM1RWaEdWMkV6YUdGWlZFWjNWRVp3Umxkc1pHdFdNSEJJV1ZWa2IxUnNaRVpUYWxaWFlsUkdNMVY2Umt0a1JscDFWR3hPYVdFeGNGcFhWM1JyVlRKSmVGVnNhR3hTV0VKUFdXdGFkMU5HV1hsTlZXUllVakJ3V0ZZeWVHOVdNVW8yVm10NFdGWnNjRXRhVm1SSFVtczVWMVpzWkd4aE1XOTVWbTF3UjFsWFJYaFhXR2hYWW10d2FGVnFUbE5VTVd4VlVtNWtWRlpzY0hoVk1uQlRWakF4VjFacVZsWk5ibEp5V1ZkNFQxSnJOVmRhUm5CcFVtdHdTVlp0ZEdGVk1WbDRXa2hPV0dKWWFGaFVWVkpTWlZaYVdFMVVVbWxOVmxZMVZXeG9kMVZ0U2xaWGJHaFhZbFJHVDFwVlduTmpWa3B6WTBkNFUySldTbUZYVkVKcllqRmFSMU5zYkZKaVIxSlpWbTE0UzFSR1dsVlNhemxUVFZad01WVnRlRTloVmxwWFkwWndWMkpVUVhoVlZFWlBWakpLUjFkc2FHaGxiWGhaVmtaYVlXUXhUa2RYV0d4c1VqTlNXRlJYZEhkVFZscElZMFpPVjFZd1ZqVldWM2hQV1ZaYWMyTkhhRnBOYm1nelZXcEdkMU5IU2toaVJrNVlVbFZ3VTFadE1IZGxSVFZJVWxob1YxZEhhRmxXTUdSdlZqRnNjbHBIT1dwaVJsWXpWMnRhVDFkR1NuTlRiR2hZWVRGS1ZGbHJXbUZqTWs1SFlrWldWMkpHY0hsV2FrSmhVekpPZEZOclZsVmlSMUp3VmpCV1MxWldXbkZUYm5Cc1VtczFTVlZ0ZEdGaVJrcDBWV3M1V21KVVJuWlpha1poWTFaR2RGSnNaRTVoZWxZMlYxUkNWMkl4VlhsVGEyaFdZa2RvVmxadGVHRk5NVnBZWlVkR2FrMVdXbmxXUjNocllVZFdjMWRzYkZk
aGExcDJXV3BLUjJNeFRuTmhSMmhUWlcxNFdGZFdaREJrTWxKelYydFdVMkpHY0hKVVZscDNaVlp3UmxaVVJtaFdhM0F4VlZab2ExWXhTbk5qUmxKV1ZrVmFhRmt5YzNoV01XUnlUbFprVTJFelFscFdiVEIzWlVkSmVWWnVUbGhpYkVwUFZteGFkMk14V25SbFIwWnNZa1p3TUZwVmFHdGhSbHAwVld0b1ZrMVhhRE5XTUZwYVpXeEdjbUZHWkdoaE0wSlJWbTF3UjFVeVVsZFdiazVTWWtkU2NGWnRkSGRXYkZsNFdrUlNhVTFyYkRSWmExcHJXVlpLYzFkc1RscGlSbHBvV1dwR1UxZEhVa2xhUmxKWFlrWlpNVmRyVm1wT1ZsbDVVMnhhYWxKWGVHaFdiR1JUWkd4YWNWTnJkRlJTYkZwNVZERmFWMVl5U2tsUmFscFhZbGhDU0ZkV1dtdFhSa3B5V2tkb1UyRjZWbmRXVnpBeFVXc3hWMWRZYUZoaVIxSmhWbXBDVjA1R1dsaE9WazVXVFd0d2VWUnNXbk5YYlVWNFkwZG9WMDFHY0hwV2JGcFBZekZPY2s1V1RtbFRSVXBJVm14amVHVnJOVmRYYTJSVVltdHdjVlV3WkRSWFZteFZVbTVrVlZKdGVGaFdNblF3WVRGYWNsZHFSbGhoTVhCeVdWWmFhMUl4VG5OaVJtUnBWa1ZKTUZac1kzaFdNVWw0WTBWc1YySkdjSEJXTUZaTFZsWmFkR05GWkZSTlYxSklWakowYTFkSFNsWlhia3BWVmxaS1NGVXdXbUZrUlRWWFZHMW9UbFpYT0hsWFYzUmhZVEZhZEZOc2JHaFNSVFZXVm14YWQyVnNXblJsUjNScVRWWktlbGRyV210aFZrNUdVMjFHVjAxV2NGaFdha1pXWlVaa2MyRkdVbWhOYkVwNFZsZHdTMkl4V1hoVmJGcGhVbXMxV1ZWdGVGZE5NVmw1WkVSQ2FFMVZiRE5VYkZaclZsZEtSMk5JU2xkU00yaG9WakJrVW1WdFRrZGFSMnhZVWpKb1ZsWnNhSGRSYlZaSFZHdGtWR0pIZUc5VmFrSmhWa1phY1ZOdE9WZGlSMUpaV2tWa01HRlZNWEppUkZKWFlsUldTRlpYTVV0V2JHUnpZVVp3YUUxWVFYcFdSbHBoWTIxUmVGcElVbXRTTW1oUFdWUk9RMU5XWkZoa1JrNVZUVlpzTTFSV2FFZFdNa1Y2WVVkR1dsWkZXak5XUlZwM1VteGtjMXBIZEZkTlNFSkpWMVpXYTJJeFVuTmFSVnBVWVd4YVdGbHNVa2ROTVZZMlVtdDBhMUpzY0RGV1IzaFBZVmRGZUdOR2NGaFdNMUp5VmxSS1NtVkdWbk5oUjNoVFYwVktVRmRYZEdGa01VNUhWMWhzVGxaWFVsbFZha0ozVjBaWmVXVkhPVmROVlc4eVZtMTBORmRHV25OalJYUmhWbTFTU0ZWdGN6VldNVnB6V2tkNGFFMVhPVFZXYlRGM1VqRnNXRkpZYUZoWFIyaFlXVzEwZDJGR1ZuRlViRTVWVFZkNFZsVnROV3RXUmxwelkwaG9WazF1UWxSV2FrRjRWakZhY1Zac1drNWliRXA1VjFaa05GUXhTbkpPVm1SaFVtNUNjRlZ0ZEhkVFZscDBaRWRHVmsxV2JEUlhhMmhQVjBkS1dXRkdhRlZXVmtwVFdsWmFZVmRGTVZWVmJXeE9WbXhaTVZaWGVHOWtNVlowVTJ4YVdHSkhhRmhaYkZKSFZERndWbGR1VG1wV2EzQjZXVEJrTkZVeVNsZFRiVVpZVmtWS2NsbHFSbUZTTVU1ellrWkthVkl4U2xwV2JURTBVekZTUjFkc1ZsTmhlbXhVVkZaYWQwMVdWblJsUlRsb1ZteHdXRmt3V25kV01rcFZVVmhvVmxaRldsQldha3BMVWpGa2MyRkhhR3hpV0doYVZtdGFZVmxYVVhoVWEyUllWMGQ0YzFW
cVRsTlhSbEpZWkVoa1ZGWnNjRWxaTUZwUFZqRlpkMVpxVmxkV00yaFFWMVphWVdNeVRraGhSbkJPWW0xbmVsWlhjRWRrTVU1SVUydG9hVkpyTlZsVmJGWjNWVEZhZEUxSVpHeFNWRlpKVld4b2IxWXhaRWhoUm14WFRVZFNkbFpxUm5OamJIQklUMWR3YVZKc1dYcFdNblJoVkRGa1IxTnVUbFJpUjJoWldWUkdTMVpHYkhKWGJVWnFUV3MxU0ZscldsZFdNa3BIWTBoc1YyRnJTbWhWZWtwUFpFWk9jbFpzVm1sU01VcDNWbXBDVjFNeFdsZFhia1pVWW01Q2MxVnRjekZUVmxaMFpFZEdhVkpyY0RCV1YzTTFWMjFLVlZKdVdscGhhMXBvV1RGYVIyUkdTbk5h</span></p>\n<p> </p>\n",
				
				"date_published": "2015-11-06T08:00:00-04:00",
				"url": "https://xlogicx.micro.blog/2015/11/06/vmwdqy.html"
			},
			{
				"id": "http://xlogicx.micro.blog/2015/11/05/assembly-is-too-high-level.html",
				"title": "Assembly is Too High Level - Subtracting by Comparing - Propeller",
				"content_html": "<p>If you're familiar with x86, you'd know that a CMP (compare) instruction is the same as a SUB (subtract) instruction, except that the destination operand is not written to. In other words, CMP is like a subtract; it just doesn't write the value when done (but it does set all of the flags that a subtract would). That is why CMP is used preceding conditional jumps: conditional jumps base their conditions on the flags. Although this post will focus on CMP and SUB, the same tricks apply to TEST and AND (this goes for both x86 and Propeller). The main difference with x86 and Propeller in this respect is that x86 has completely different machine-code for these instructions. Not the case for Propeller.</p>\n<p>In my previous blog entry on <a href=\"http://xlogicx.net/?p=343\">Propeller NOPs</a>, I went into some detail on the instruction format of the Propeller micro-controller. The main parts that are important for this entry are the 6-bit op-code part, and the ZCRI flag section (mostly just the R part of it).</p>\n<p>Let's compare the CMP and SUB instructions:</p>\n<p><strong>CMP:</strong></p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/27a100fa59.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/27a100fa59.jpg\" width=\"698\" height=\"69\"></a></p>\n<p><strong>SUB:</strong></p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/d9bd6bfceb.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/d9bd6bfceb.jpg\" width=\"701\" height=\"68\"></a></p>\n<p>So the op-code (100001) part is the same for both. Really, the only actual difference in the binary that makes up these two instructions is the R field. The R field specifies whether the value/pointer in the -DEST- field will get modified. However, we can override this default behavior within the Propeller assembly language with nr/wr (don't write/do write, respectively). This means we can override a CMP into being a SUB and a SUB into becoming a neutered CMP. 
So obviously we will do that:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/e71710de96.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/e71710de96.jpg\" width=\"334\" height=\"186\"></a></p>\n<p>And to see the Machine Code:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/d175e773fa.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/d175e773fa.jpg\" width=\"491\" height=\"46\"></a></p>\n<p>Our encoded instruction is 0x0204FC84. You will see that value repeat twice in a row, because the 'sub' and 'cmp' instructions seen in the assembly are identical in machine code. If you are skeptical that the cmp actually subtracts (because you all are), let's debug (with my favorite debugger/emulator: GEAR):</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/d16367e207.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/d16367e207.jpg\" width=\"568\" height=\"96\"></a></p>\n<p>It is shown as 2 SUBs. The last byte of the 3rd line is our 1-byte \"AVariable\" (Propeller is far from Harvard architecture). So if we subtract 2 from $55, we should have $53:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/97ce5de4ce.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/97ce5de4ce.jpg\" width=\"433\" height=\"98\"></a></p>\n<p>And we run the 2nd SUB (the assembly CMP) and we get our expected $51 from subtracting 2 again:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/0f7aa53242.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/0f7aa53242.jpg\" width=\"427\" height=\"51\"></a></p>\n",
				
				"date_published": "2015-11-05T08:00:00-04:00",
				"url": "https://xlogicx.micro.blog/2015/11/05/assembly-is-too-high-level.html"
			},
			{
				"id": "http://xlogicx.micro.blog/2015/10/20/assembly-is-too-highlevel-aadaam.html",
				"title": "Assembly is Too High-Level - AAD-AAM, Even the Math is Too High-Level",
				"content_html": "<p><a href=\"https://xlogicx.micro.blog/uploads/2026/2d8861bc92.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/2d8861bc92.jpg\" width=\"915\" height=\"41\"></a></p>\n<p>Oh boy, I love seeing words like these! Even though this post will focus on the AAD instruction, this applies to the below two instructions:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/2d5562f74a.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/2d5562f74a.jpg\" width=\"918\" height=\"150\"></a></p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/90cf341a79.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/90cf341a79.jpg\" width=\"920\" height=\"152\"></a></p>\n<p>I particularly love this one, because we get to see an illustration of abstractions being misleading on a few levels (assembly being too abstract, and even a mathematical formula being used too abstractly). I'm about to get all kinds of philosophical up in here!</p>\n<p>When approaching a tool or a system, normal people only look at what it is <strong><em>supposed to do</em></strong>; what it is <strong><em>intended</em></strong> for (and that's even assuming too much for a normal person sometimes). As hackers, we aspire to look at a system for what it <strong><em>actually</em> </strong>does. Sometimes we have to hack around just to discover this. But sometimes it isn't even hidden; it can be well documented. Even so, normal people only care about the useful abstractions, and still ignore what systems <strong><em>actually</em></strong> do.</p>\n<p>Assembly Abstraction: AAD = <span style=\"color: #0000ff;\"><strong>ASCII Adjust AX Before Division</strong></span><br>\nThis instruction is intended to take two 8-bit BCD values and convert them to one 8-bit 'binary' value. For those that don't know, BCD stands for Binary Coded Decimal. It's a way of representing (only) decimal values in a binary/hex encoding. 
To represent the decimal number of 79 in hex, it would look like 0x4f (just 1 byte). In BCD, it could look like 0x79. In BCD, we ignore the A-F values of hex, even though we are still using standard 4-bit nibbles and 8-bit bytes. This means we are wasting data space. Even worse, AAD takes an 8-bit value for each digit, so 79 would actually be 0x0709.</p>\n<p>For AAD, the first byte (07 in this case) is in AH, and the next byte is in AL (09). After using the AAD instruction, the result is put into AL. The instruction mentions AX because AX is AH:AL (Accumulator High, Accumulator Low). So if AH had 0x07, and AL had 0x09, and we then ran AAD, AL would then have 0x4f (decimal 79). That's what this instruction is <strong><em>supposed</em></strong> to do.</p>\n<p>Let's look at the machine code for AAD (right next to the assembly that created it):</p>\n<p><a><img src=\"https://xlogicx.micro.blog/uploads/2026/screen-shot-2015-10-20-at-5.29.55-pm.png\" width=\"600\" height=\"69\" alt=\"\"></a><br>\nD5 is the actual machine-code for AAD. 0A is put there by our assembler for us, it is used for base 10 (notice we didn't say 'aad 10' in our source file). We can't change this value in assembly. As the Intel Manual states, this can only be done in machine code. Let's do a base 2 conversion. AH will be 01, and so will AL. So if AX is 0x0101, AAD (2) should yield 0x03 in AL. This is because 11 in binary is 3 in decimal.</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/afb7d22522.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/afb7d22522.jpg\" width=\"639\" height=\"66\"></a></p>\n<p>As you can see, AL (seen in EAX register) has the value that we expected.</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/e73f79e80c.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/e73f79e80c.jpg\" width=\"921\" height=\"64\"></a></p>\n<p>So by unlocking this 'base' byte, we can convert from arbitrary bases, as Intel states. 
But we are not at the bottom of this abstraction stack yet. Converting from any base is what it is <em><strong>supposed</strong></em> to be used for at this level...again, what does it <em><strong>actually</strong></em> do?</p>\n<p>Mathematical Abstraction:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/51bfa128f5.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/51bfa128f5.jpg\" width=\"632\" height=\"274\"></a></p>\n<p>So really: <strong>AL = AL + (AH * base)</strong><br>\nWhere you provide the 1 byte base.</p>\n<p>This formula absolutely does what it is supposed to do. So we are done, we slap the <span style=\"color: #ff0000;\"><strong>\"dutifully converts from base\"</strong></span> label on this formula and that is what it <strong><em>is</em></strong>. Abstractions make life so easy on our stupid brains. But converting bases isn't really what it <strong><em>does</em></strong>. It <strong><em>can</em></strong> do that, but what it <strong><em>really</em></strong> does is <strong>AL = AL + (AH * base)</strong>. Just so you don't think I'm splitting hairs like usual, consider that the 1 byte base value can be any value from 0x00-0xff. So what does it mean to convert 0x0709 from base 1, or base 0 <span style=\"color: #999999;\">(<em>just so you know: not a thing</em>)</span>? Even base 2 doesn't make sense here, because 7 and 9 are not valid binary digits. That doesn't mean we can't still compute AL = 9 + (7 * 2) and get a value anyway (0x17).</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/6ea2451b4c.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/6ea2451b4c.jpg\" width=\"637\" height=\"62\"></a></p>\n",
				
				"date_published": "2015-10-20T08:00:00-04:00",
				"url": "https://xlogicx.micro.blog/2015/10/20/assembly-is-too-highlevel-aadaam.html"
			},
			{
				"id": "http://xlogicx.micro.blog/2015/09/27/assembly-is-too-high-level.html",
				"title": "Assembly is Too High Level - Propeller NOPs, like if never",
				"content_html": "<p>This post is not about x86/64, as the title goes, this is about the Propeller architecture.</p>\n<p>But first, here's some x86 (hahaha):<br>\n<img class=\"alignnone size-full wp-image-345\" src=\"http://xlogicx.net/wp-content/uploads/2015/09/Screen-Shot-2015-09-27-at-5.42.39-PM.png\" alt=\"xchg\" width=\"757\" height=\"27\"><br>\nNote that if you encode the XCHG instruction with EAX, EAX, it would just be 0x90; the exact same machine code as:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/79e5c88c51.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/79e5c88c51.jpg\" width=\"828\" height=\"27\"></a></p>\n<p>Sorry to have to put everyone through so much x86 for a Propeller post, but I think there's a very strong likeness here; the Propeller chip also does not have dedicated machine-code for the NOP. I find a Propeller NOP to be much more interesting.</p>\n<p> </p>\n<h4><strong>About Propeller ASM and Machine-Code structure:</strong></h4>\n<p>Unlike x86, machine instructions for Propeller are not variable in size; each instruction occupies 32 bits. There are 6 bits that specify which instruction (up to 64 possible different instructions), 2 flag setting bits (Zero and Carry), a bit that specifies if the destination operand gets written to, a bit specifying immediate or memory location mode, 4 bits for flag tests, 9 bits for the destination, and another 9 bits for the source operand. This totals up to 32 bits. EVERY instruction has this structure. It may be complicated at first to divide up these bits, but once you get past that, this architecture is incredibly simple and elegant.</p>\n<p>Below is a table that describes the WRBYTE instruction. 
I picked this one because it is the first binary instruction (000000):</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/15f25ca53d.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/15f25ca53d.jpg\" width=\"332\" height=\"47\"></a></p>\n<p>Although, for fun, we can look at the last binary instruction (WaitVid):</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/7a8a44f32d.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/7a8a44f32d.jpg\" width=\"331\" height=\"46\"></a></p>\n<p> </p>\n<p> </p>\n<h4><strong>Conditional Checks (-CON-):</strong></h4>\n<p>4 bits isn't that much, let's spell out each of them. Where:<br>\nZ is Zero Flag<br>\nC is Carry Flag<br>\n* is AND<br>\nv is OR<br>\n! is NOT<br>\n= is Equal</p>\n<p>0001: !C*!Z<br>\n0010: !C*Z<br>\n0011: !C<br>\n0100: C*!Z<br>\n0101: !Z<br>\n0110: C!=Z<br>\n0111: !Cv!Z<br>\n1000: C*Z<br>\n1001: C=Z<br>\n1010: Z<br>\n1011: !CvZ<br>\n1100: C<br>\n1101: Cv!Z<br>\n1110: CvZ</p>\n<p>I skipped 2 special values. 1111 means IF_ALWAYS, most instructions default to this condition if not specified (you will see this above in the tables for WRBYTE and WaitVid). That leaves 0000: IF_NEVER. In Assembly, you can literally put IF_NEVER in front of any instruction you feel like, and it effectively makes it a NOP! In machine code you would just 0 out those 4 bits.</p>\n<p>Although, if you actually write 'NOP' in assembly, it has to pick one of these variations. If you were to assemble a NOP and then disassemble it, it would look like a carefully crafted WRBYTE instruction.</p>\n<h4><strong>PoC:</strong></h4>\n<p>Let's look at some Propeller Assembly code with a NOP, a couple WRBYTEs and a corresponding machine-code dump. 
I drew arrows from the assembly to the corresponding machine-code for clarity.</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/aba1035bff.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/aba1035bff.jpg\" width=\"495\" height=\"352\"></a></p>\n<p>Note that the memory model is Little-Endian (data goes backward per byte). Let's break apart the 2nd wrbyte instruction into its pieces, as it's the main non-zero instruction in the dump. I will take the hex and separate it out into the INSTR, ZCRI, CON, DEST, and SRC bit fields.</p>\n<p>Hex 0x03063C00 = Convert from Little Endian -&gt; 0x003C0603 = binary 0000 0000 0011 1100 0000 0110 0000 0011</p>\n<p>Clearer grouping = 000000 0000 1111 000000011 000000011</p>\n<p>000000 = WRBYTE</p>\n<p>0000 = Don't set Zero or Carry flags, R = 0 means Write in the context of this instruction, R = 1 would be Read, I = 0 means non-immediate value (pointer)</p>\n<p>1111 = Same as saying if_always</p>\n<p>000000011/000000011 = This refers to the 3rd (counting from zero) 32-bit value in memory (including instructions!). Looking at the assembly, nop is the 0th, 'wrbyte variable, variable' is 1st, 'wrbyte start, start' is 2nd, and our variable declaration is the 3rd. So that's how this happens. Propeller interestingly does not have a distinct code/data isolation.</p>\n<p> </p>\n<h4><strong>Create a valid NOP without a NOP:</strong></h4>\n<p>We see that assembling a NOP produces 0x00000000 in machine code. Let's see what it would take to write some assembly that would produce this same value without using an actual NOP in assembly (spoiler in the above image). I will first break down our zeros into their fields:</p>\n<p>000000 = WRByte</p>\n<p>0000 = Don't write flags, already Write mode with 0 bit, and we have to use pointer mode; not immediate</p>\n<p>0000 = we can only get this set of 0's by specifying the if_never prefix</p>\n<p>000000000/000000000 = This is the tricky part (kind of). 
This has to refer to the 0th 32-bit memory value (our first instruction). Turns out that our Start label contains just that (just as it likewise would in x86 assembly). This means that we can use our 'Start' label for the Dest and the Source operands. We end up with 'if_never wrbyte Start, Start.' As seen in the machine dump, this did in fact produce our all 0's machine code, identical to the real NOP.</p>\n<p>So many flavors of NOP:</p>\n<p>If you set the CON field to 0000 on any instruction (either directly in machine code or by using the if_never), you effectively make it a NOP. A NOP takes 4 clock ticks. Instructions that otherwise take significantly more clock ticks should only take the 4 clock ticks if you shove an if_never into the CON field.</p>\n<p>Post Disclaimer: None of this is really useful knowledge, it's just fun to hack for hacking's sake.</p>\n",
				
				"date_published": "2015-09-27T08:00:00-04:00",
				"url": "https://xlogicx.micro.blog/2015/09/27/assembly-is-too-high-level.html"
			},
			{
				"id": "http://xlogicx.micro.blog/2015/09/19/obscure-fisting.html",
				"title": "Obscure FISTing",
				"content_html": "<p><a href=\"https://xlogicx.micro.blog/uploads/2026/e5ab1f7525.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/e5ab1f7525.jpg\" width=\"323\" height=\"239\"></a></p>\n<p>In the realms of assembly obfuscations, this isn't extremely high in complexity. This is me finding an excuse to use the \"FIST\" instruction. In the context of the PoC, it is being used as a decoder for some encoded shellcode. Before going into the super technical details, below is a video of the PoC in action.</p>\n<p></p><p><a href=\"https://www.youtube.com/watch?v=e-14WFLCl-M\">www.youtube.com/watch</a></p>\n<h4></h4>\n<h4><strong>Floaters:</strong></h4>\n<p>Let's talk about floating point numbers (as it applies to x86). I've said this a ton, but data is meaningless without context. For example 0xfe could be 254 decimal, but if it were a signed integer, it would be -2. The sign is defined by the most significant (left-most) bit, when interpreted as a signed int. Floating point representation has its own complex structure. In a more standard notation, it's what gives us the ability to represent numbers like 13 * 10^4. However, everything is still represented base 2, so it's more like 1.9837 * 2^16. With this type of notation, we can represent very large and very small numbers (with a loss of precision). In a 32-bit floating point number, we have 1 bit for the sign (and these are not 2's complement), 8 bits for the multiplier (called the bias), it can represent positive and negative (fractional), and the remaining 23 bits represent the number before modification (multiplying out). If only it were that simple though, this 23 bit number has an implied leading 1. So 1.265 would be stored as 265 and the 1 is assumed. These 23 bits also represent fractional binary.</p>\n<h4></h4>\n<h4><strong>Fractions:</strong></h4>\n<p>Let's talk about fractional binary. We know that each increasing bit represents twice as much as the bit before it. 
Well after the decimal place, each bit represents half as much. Let's look at a non-fraction to illustrate the first point: 1011. The right most digit represents 1, the next 2, then 4, then 8. We have 1, 2, and 8 'selected', so it's 11 decimal. What is 1011.1011? Starting from left to right after the decimal, we have 1*.50 + 0*.25 + 1*.125 + 1*.0625. The fractional part is .6875, so the entire number would be 11.6875 as a decimal number.</p>\n<h4></h4>\n<h4><strong>Biased:</strong></h4>\n<p>The 8-bit bias part of our data rests on 0x7f. Larger than 0x7f is a larger exponent, smaller than 0x7f is a fractional exponent.</p>\n<h4></h4>\n<h4><strong>Exercise:</strong></h4>\n<p>Let's encode 11.6875 (decimal) into a floating point number. First we must convert it to binary, which we already have from above; it's 1011.1011. Now we have to push it back a few decimal places to get it to be a 1.xxxx... format: 1.0111011. We had to push this 3 places, so our exponent will be 3. We add this 3 to 0x7f getting us 0x82 for the bias bits. The sign bit will be 0 for positive. As far as representing our binary in 23 bits, we are supposed to remove the leading 1 and pad with 0's until it's 23 bits. It helps me to look at the full 32-bit value in binary before making it hex. In order, we have sign bit (0), the bias (1000 0010 / 0x82), and the number (0111 011 0000000000000000). We combine all of that to get: 01000001001110110000000000000000. In hex this is 0x413b0000. So if 0x413b0000 is interpreted as floating point data, it represents 11.6875.</p>\n<h4></h4>\n<h4><strong>The FPU Stack:</strong></h4>\n<p>The FPU is a set of 8 additional registers (ST0-ST7). These registers are actually 80-bit each. The FPU is mostly interfaced like a stack; in the sense that we usually push data to it and pop data from it. To push data to the FPU stack, we can use the FLD instruction (floating point load) to take data from a location in memory and place it into ST0. 
You can then use FST (floating point store) to take the first FPU register (ST0) off of the stack and place it into memory. Both of these instructions assume the data is already encoded as floating point.</p>\n<h4></h4>\n<h4><strong>Conversions and FISTing:</strong></h4>\n<p>One cool thing is that you can have your integer data converted into floating point with the FILD instruction (I for integer). This takes an integer from memory, encodes it as floating point, and pushes it to the FPU stack. Likewise, you can take floating point data from the FPU and convert it back to an integer with FIST (again, I for integer). For practical reasons I use FISTP (P for pop). Interestingly, by default FST/FIST accesses the data on the top of the FPU stack without actually removing it (popping). So if you keep pushing with FLD/FILD and not popping, you'll run into problems.</p>\n<h4></h4>\n<h4><strong>Obscurely FISTing a shell:</strong></h4>\n<p>The idea is to have a bunch of shellcode encoded into memory as floating point data, only to have FIST decode it back into memory and then execute. It is unreasonable to expect to be able to convert every 32-bit instruction into something that equates to a 32-bit floating point number. It is far easier (for me) to just use the least significant 2 bytes of the integer from a float as the machine code. This means I need to take some shellcode and break it up into 2-byte chunks. Each chunk needs to be encoded as floating point data. 
I did this manually...There were some steps in between, but for posterity, here's the scratchpad text file I was working around in:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/aaaac27172.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/aaaac27172.jpg\" width=\"562\" height=\"448\"></a></p>\n<h4></h4>\n<h4><strong>Manually Converting:</strong></h4>\n<p>The idea is to manually convert 2 bytes into a 4-byte float that will be decoded by FIST where the least significant 2 bytes will be the original 2 bytes we were encoding. Let's look at 'Int 0x80' (this was the last line in my sloppy work screenshotted above). In machine code that is 0xcd80. Because of some Little Endian side effects, we will need to pre-reverse the bytes, so 0x80cd. In binary this is 1000 0000 1100 1101. We then shift this to 1.000 0000 1100 1101. That is 15 shifts, where 1.000 0000 1100 1101 * 2 ^ 15 would get us back to our 0x80cd. So let's encode. It's positive, so the sign bit is 0. Its bias is 15, and we add that to the 0x7f, so that's 0x8e (1000 1110). And for the number, we ignore the leading 1 and pad with 0's to the right to get 23 bits, so 00000001100110100000000. All that binary together is 0100 0111 0000 0000 1100 1101 0000 0000 or 0x4700cd00 represented in hex. So if 0x4700cd00 was in ST0, FIST/FISTp would put 0xcd800000 into our memory location. I only end up using those first 2 bytes in my PoC.</p>\n<h4></h4>\n<h4><strong>What the PoC does:</strong></h4>\n<p>First note that this PoC has been modified where .text is rwx (can be modified on the fly), this was done manually in a hex editor (which is also demonstrated in the video). It already has the encoded shellcode in memory with assembly dq directives and a handy pointer to it with 'data:'. I initialize a looping counter with ECX, specify where our program will end up starting with EDX, and get our data address into EBX. 
I then load the first encoded data with FLD, then decode it out into our program memory area with FISTP, I increment the pointers (data and program) and decode the next until I am done. When done, I jump to the start of the decoded program. There is nothing spectacular about the decoded program, just the same execve(/bin/sh) thing I always do.</p>\n<p>Below is a screenshot of the source:</p>\n<p><a href=\"https://xlogicx.micro.blog/uploads/2026/23725daedd.jpg\"><img src=\"https://xlogicx.micro.blog/uploads/2026/23725daedd.jpg\" width=\"647\" height=\"546\"></a></p>\n",
				
				"date_published": "2015-09-19T08:00:00-04:00",
				"url": "https://xlogicx.micro.blog/2015/09/19/obscure-fisting.html"
			}
	]
}
