<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>

<channel>
	<title>HALCODE</title>
	<atom:link href="http://www.halcode.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.halcode.com</link>
	<description>Computer Science Reviews: Passion for (my badly written) code</description>
	<pubDate>Mon, 23 Jun 2008 12:56:31 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.5.1</generator>
	<language>en</language>
			<item>
		<title>Resources for Articulatory Synthesis Research</title>
		<link>http://www.halcode.com/archives/2008/06/10/resources-for-articulatory-synthesis-research/</link>
		<comments>http://www.halcode.com/archives/2008/06/10/resources-for-articulatory-synthesis-research/#comments</comments>
		<pubDate>Mon, 09 Jun 2008 21:50:26 +0000</pubDate>
		<dc:creator>Jose</dc:creator>
		
		<category><![CDATA[Speech Technologies]]></category>

		<category><![CDATA[acoustic phonetics]]></category>

		<category><![CDATA[anatomy]]></category>

		<category><![CDATA[articulatory synthesis]]></category>

		<category><![CDATA[fant]]></category>

		<category><![CDATA[flanagan]]></category>

		<category><![CDATA[formants]]></category>

		<category><![CDATA[machine learning]]></category>

		<category><![CDATA[maeda]]></category>

		<category><![CDATA[models]]></category>

		<category><![CDATA[phonetics]]></category>

		<category><![CDATA[physiology]]></category>

		<category><![CDATA[síntesis articulatoria]]></category>

		<category><![CDATA[spectrum]]></category>

		<category><![CDATA[speech]]></category>

		<category><![CDATA[speech signals]]></category>

		<category><![CDATA[vocal tract]]></category>

		<guid isPermaLink="false">http://www.halcode.com/?p=29</guid>
		<description><![CDATA[A list of important documents for Articulatory Speech Synthesis and Inversion research.]]></description>
			<content:encoded><![CDATA[<p>Recently I've been asked for some pointers to more resources on Articulatory Speech Synthesis. The articulatory approach is a captivating research topic, but it's relatively hard, and it builds on a hefty amount of multidisciplinary documents and results. Some of the relevant papers and books are rather old or difficult to find. This is my list of selected resources:</p>
<p><strong>Books</strong></p>
<ul>
<li>Gunnar Fant - <a href="http://www.amazon.com/Acoustic-Production-Description-Analysis-Contemporary/dp/9027916004" title="Fant Acoustic Theory of Speech Production" onclick="javascript:pageTracker._trackPageview ('/outbound/www.amazon.com');">Acoustic Theory of Speech Production</a></li>
<li>James L. Flanagan - Speech Analysis, Synthesis and Perception</li>
<li>Kenneth N. Stevens - <a href="http://www.amazon.com/Acoustic-Phonetics-Current-Studies-Linguistics/dp/0262692503" title="Stevens Acoustic Phonetics" onclick="javascript:pageTracker._trackPageview ('/outbound/www.amazon.com');">Acoustic Phonetics</a></li>
<li>Paul Boersma - Functional Phonology</li>
<li>J. M. Pickett - <a href="http://www.amazon.com/Acoustics-Speech-Communication-Fundamentals-Perception/dp/0205198872/ref=sr_1_1?ie=UTF8&amp;s=books&amp;qid=1213044654&amp;sr=1-1" onclick="javascript:pageTracker._trackPageview ('/outbound/www.amazon.com');"> The Acoustics of Speech Communication</a></li>
<li>D. G. Childers - <a href="http://www.amazon.com/Speech-Processing-Synthesis-Toolboxes-Childers/dp/0471349593" title="Speech Processing and Synthesis Toolboxes" onclick="javascript:pageTracker._trackPageview ('/outbound/www.amazon.com');">Speech Processing and Synthesis Toolboxes</a></li>
<li>A. Seikel, D. King and D. Drumright - <a href="http://www.amazon.com/Anatomy-Physiology-Speech-Language-Hearing/dp/1401825818" title="Anatomy and Physiology for Speech" onclick="javascript:pageTracker._trackPageview ('/outbound/www.amazon.com');">Anatomy and Physiology for Speech, Language and Hearing</a>. Lovely book.</li>
</ul>
<p><span id="more-29"></span><strong>Papers, Theses and Slides<br />
</strong></p>
<ul>
<li>P. Mermelstein - <a href="http://www.halcode.com/docs/mermelstein.pdf" title="Mermelstein Articulatory Model">Articulatory Model for the Study of Speech Production</a>. Mermelstein's paper is a classic, and mandatory reading for anyone interested in articulatory synthesis and inversion. It describes the most widely used articulatory model to date. Although 3D articulatory models are the current trend, Mermelstein's model is still very useful for research. This version has some low-quality pages. I'll provide a better scan later.</li>
<li>Michael Portnoff - <a href="http://www.halcode.com/docs/portnoff.pdf" title="Portnoff">A Quasi-one-dimensional Digital Simulation for the Time-varying Vocal Tract</a></li>
<li>J. Dang and K. Honda - <a href="http://www.halcode.com/docs/dang_honda.pdf" title="Physiological Articulatory Model">Construction and Control of a Physiological Articulatory Model</a></li>
<li>I. Howard and M. Huckvale - <a href="http://www.halcode.com/docs/howard.pdf" title="Howard articulatory synthesizer">Learning to control an articulatory synthesizer by imitating real speech</a></li>
<li>Olov Engwall - <a href="http://www.halcode.com/docs/engwall.pdf" title="Engwall 3D model">Vocal tract modeling in 3D</a></li>
<li>R. Sproat, M. Ostendorf and A. Hunt (Editors) - <a href="http://www.halcode.com/docs/sproat.pdf" title="The Need for Speech Synthesis Research">The Need for Increased Speech Synthesis Research</a></li>
<li>W. Hess - <a href="http://www.halcode.com/docs/hess.pdf" title="Wolfgang Hess Slideshows">Artikulatorische und akustische Phonetik</a> (German). This presentation of articulatory and acoustic phonetics by Professor Wolfgang Hess is one of the best on the web.</li>
<li>Qiguang Lin - Speech Production Theory and Articulatory Speech Synthesis.</li>
</ul>
<p><strong>A few of my own documents</strong></p>
<p>My research is just a drop compared to the above oceans. Although some of the ideas and assumptions I followed are already old, someone may find them useful:</p>
<ul>
<li>J. Brito - <a href="http://linkinghub.elsevier.com/retrieve/pii/S1568494606000639" title="Brito Genetic Learning of Vocal Tract Area Functions" onclick="javascript:pageTracker._trackPageview ('/outbound/linkinghub.elsevier.com');">Genetic Learning of Vocal Tract Area Functions for Articulatory Synthesis of Spanish Vowels</a>.</li>
<li>J. Brito and W. Rodríguez - <a href="http://ieeexplore.ieee.org/iel5/10898/34297/01635777.pdf?tp=&amp;isnumber=&amp;arnumber=1635777" title="Brito Multipopulation Learning Articulatory Models for Speech Synthesis" onclick="javascript:pageTracker._trackPageview ('/outbound/ieeexplore.ieee.org');">Multipopulation genetic learning of midsagittal articulatory models for speech synthesis</a>.</li>
<li>J. Brito - Técnicas de Aprendizaje Artificial Aplicadas al Problema Inverso de la Síntesis Articulatoria de Voz por Computadora (Spanish). My doctoral dissertation. I'll upload it later... I don't remember where it is <img src='http://www.halcode.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </li>
</ul>
<p>Finally, please note that the list is by no means complete or comprehensive. Some very important papers are missing, such as Flanagan and Ishizaka's vocal fold modeling, Maeda's simulation of the vocal tract, Rubin's description of a synthesizer, Sorokin's model and others. I'll be adding more items in the future.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.halcode.com/archives/2008/06/10/resources-for-articulatory-synthesis-research/feed/</wfw:commentRss>
		</item>
		<item>
		<title>coLinux, int 80 on Windows and other rants</title>
		<link>http://www.halcode.com/archives/2008/06/02/colinux-int-0x80-on-windows-and-other-rants/</link>
		<comments>http://www.halcode.com/archives/2008/06/02/colinux-int-0x80-on-windows-and-other-rants/#comments</comments>
		<pubDate>Mon, 02 Jun 2008 15:41:18 +0000</pubDate>
		<dc:creator>Jose</dc:creator>
		
		<category><![CDATA[Assembler]]></category>

		<category><![CDATA[C]]></category>

		<category><![CDATA[Languages]]></category>

		<category><![CDATA[Operating Systems]]></category>

		<category><![CDATA[Programming]]></category>

		<category><![CDATA[abi]]></category>

		<category><![CDATA[api]]></category>

		<category><![CDATA[assembly]]></category>

		<category><![CDATA[colinux]]></category>

		<category><![CDATA[cpu]]></category>

		<category><![CDATA[cygwin]]></category>

		<category><![CDATA[interruption]]></category>

		<category><![CDATA[kernel]]></category>

		<category><![CDATA[Linux]]></category>

		<category><![CDATA[parameters]]></category>

		<category><![CDATA[system call]]></category>

		<category><![CDATA[tips]]></category>

		<category><![CDATA[ubuntu]]></category>

		<category><![CDATA[virtualization]]></category>

		<category><![CDATA[vmware]]></category>

		<category><![CDATA[win32]]></category>

		<category><![CDATA[windows]]></category>

		<guid isPermaLink="false">http://www.halcode.com/?p=27</guid>
		<description><![CDATA[Generally speaking, an Application Binary Interface (ABI) is the interface between an application program and the operating system. Conceptually, it's related to the better-known API concept. But ABIs are a low-level notion, while APIs lean more toward the application source code level.]]></description>
			<content:encoded><![CDATA[<p>Recently, a reader sent me an email describing some problems he faced when trying to assemble, on <a href="http://en.wikipedia.org/wiki/Cygwin" title="Cygwin" onclick="javascript:pageTracker._trackPageview ('/outbound/en.wikipedia.org');">Cygwin</a>, code originally written for Linux. The problem, as he stated it, was that <code>int 0x80</code> didn't work as expected. Well, a few explanations are in order:</p>
<p><strong>Cygwin</strong></p>
<p>Cygwin lets you run a collection of Unix tools on Windows, including the GNU development toolchain. At its core, however, Cygwin is a library that translates the <strong>POSIX system call API</strong> into the corresponding Win32 system calls (system calls are often abbreviated as <strong>syscalls</strong>). Cygwin is therefore a software layer between applications that use POSIX system calls and the Win32 operating systems, which makes it possible to <em>port</em> some Unix applications to Windows. This way you can, for instance, have the Apache daemon working as a Windows service. Another very attractive feature of Cygwin is its interactive environment: you can run your shell quite nicely, and run your Autoconf scripts, for example. However, <strong>porting means recompiling</strong>. There is no binary compatibility, and your program cannot run on computers without Cygwin (without <code>CYGWIN1.DLL</code>, more precisely). Furthermore, although some progress has been made, Cygwin is relatively slow (it's a POSIX compatibility <em>layer</em>, after all). If possible, I prefer to recompile my applications directly with MinGW. For me, this allows for a faster development cycle. Note, though, that Cygwin can compile MinGW-compatible executables. It's just that, as I said, I prefer to work with MinGW directly. I only work on Windows if I have to develop applications for Windows. But Linux's development tools are the best, and we can access several of them by using MinGW. I think that Cygwin is best suited for general cross-development and for handling complicated software porting.</p>
<p><strong>System Calls and int 0x80</strong></p>
<p>A system call is a request by an <em>active process</em> for a service performed by the operating system kernel. Remember that a process is an executing (running) instance of a program, and the active process is the process currently using the CPU. The active process may perform a system call to request the creation of another process, for instance. Or perhaps the process needs to communicate with a peripheral device. In Linux on 32-bit x86, <code>int 0x80</code> is the traditional assembly language instruction used to invoke system calls. <code>int 0x80</code> is a software interrupt: it is raised by a software process, not by a hardware device. Before invoking the interrupt, our program has to store the <strong>system call number</strong> (which tells the operating system which service the program is requesting) in the proper CPU register (<code>eax</code>, with any arguments in further registers). Every interrupt is a signal to the operating system, notifying it about the occurrence of an event that must be handled.</p>
<p><span id="more-27"></span><strong>Application Binary Interface</strong></p>
<p>Generally speaking, an Application Binary Interface (ABI) is the interface between an application program and the operating system. Conceptually, it's related to the better-known API concept. But ABIs are a low-level notion, while APIs lean more toward the application source code level. If your program uses a specific API, you will be able to <em>compile</em> it on any system that implements that API. Similarly, a program (or, more generally, compiled object code) that uses a specific ABI will run, without recompilation, on any system offering that ABI. The specification of an ABI includes, but is not limited to, details such as:</p>
<p>1. How applications make system calls to the operating system.</p>
<p>2. System call numbers.</p>
<p>3. Calling conventions (how parameters are passed to functions, and how their return values are received).</p>
<p>4. The format of the object code (COFF/PE, ELF, etc.)</p>
<p>To make this concrete, note that a relatively backward-compatible ABI is what has allowed many older applications to run on newer versions of Windows.</p>
<p>Now a few things should be quite clear. First, Cygwin is a software layer at the API level. Second, interrupts are a concept from the ABI level. And it's obvious that Linux syscalls are entirely different from Windows syscalls. Further, in Cygwin you don't make raw system calls... it's CYGWIN1.DLL that makes the needed Windows system calls on behalf of your program. The ABI of software compiled with Cygwin is that of Windows systems. You cannot use a Linux ABI feature, such as int 0x80, and expect your program to run fine on Windows.</p>
<p>If you're programming in assembly, <code>int 0x80</code> only works in Linux. For DOS (and DOS programs running under Windows), you must use interrupts such as <code>0x21</code>, <code>0x25</code>, <code>0x26</code>, etc. That's the rationale behind my decision to use a C library function, printf, to sidestep this problem in the example code of <a href="http://www.halcode.com/archives/2008/05/11/hello-world-c-and-gnu-as/" title="hello world, c and gnu as">this post</a>.</p>
<p><strong>The Truth Reflected in the Mirror</strong></p>
<p><em>You cannot use a Linux ABI feature, such as <code>int 0x80</code>, and expect your program to run fine on Windows</em>. That's not entirely true. Under Windows, you can install a <a href="http://msdn.microsoft.com/en-us/library/ms681411(VS.85).aspx" title="Vectored Exception Handler" onclick="javascript:pageTracker._trackPageview ('/outbound/msdn.microsoft.com');">vectored exception handler</a> that traps the interrupt request and, with some coding, translates the trapped information into the Windows equivalent. But that amounts to writing a Linux kernel emulator.</p>
<p>If you are in a hurry, perhaps you'll prefer the wise path of virtualization. By using a virtualization engine like <a href="http://en.wikipedia.org/wiki/VMware" title="VMWare" onclick="javascript:pageTracker._trackPageview ('/outbound/en.wikipedia.org');">VMware</a> or <a href="http://en.wikipedia.org/wiki/Xen" title="Xen" onclick="javascript:pageTracker._trackPageview ('/outbound/en.wikipedia.org');">Xen</a> you can install and run a guest operating system on a host system. Such virtual machines offer a full abstraction of resources, which lets you use the guest operating system almost to its full extent. For some tasks, the <a href="http://www.colinux.org/" title="coLinux" onclick="javascript:pageTracker._trackPageview ('/outbound/www.colinux.org');">coLinux</a> approach may be preferable. Cooperative Linux (coLinux) allows the Windows and Linux kernels to run simultaneously on the same machine but, unlike traditional virtual machines, coLinux shares resources that already exist in the host.</p>
<p>coLinux is a very elegant and exciting project. However, installing and configuring coLinux can be hard. After installing your Linux distro, it's not easy to configure the window manager, the networking facilities, and other things. The best resource I've found on the net for such chores is this post by Sofeng: <a href="http://iwiwdsmi.blogspot.com/2008/04/install-colinux-and-ubuntu-gutsy-on-win.html" title="Sofeng coLinux" onclick="javascript:pageTracker._trackPageview ('/outbound/iwiwdsmi.blogspot.com');">Install coLinux (and Ubuntu Hardy) on Win XP using Slirp to internet and TAP to host behind a corporate firewall/proxy server</a>. It's highly recommended reading.</p>
<p>Nevertheless, I have a couple of quick tips for those who want to start programming in assembly as soon as possible. <em>Please note that software is constantly evolving and the following tips may be obsolete by the time you read them. Review the excellent coLinux documentation beforehand</em>. During coLinux setup, select the Ubuntu distribution. After installation, go to the folder where you installed it, and decompress the file Ubuntu-7.10.ext3.2GB.7z. Also put <a href="http://www.halcode.com/docs/UbuntuSlirp.bat" title="UbuntuSlirp">UbuntuSlirp.bat</a> and <a href="http://www.halcode.com/docs/colinux.conf" title="coLinux">coLinux.conf</a> into your coLinux directory (these files may require some tweaking according to your system). Run coLinux by executing UbuntuSlirp.bat. Linux should boot nicely. Use <code>root</code> as both login and password. And that's it. You'll have Linux on Windows. But developers need the GNU development tools. Install them by executing <code>apt-get install gcc</code>.</p>
<p>At this point, you should have the GNU assembler (as) installed. Now you can type your programs, and assemble/run them. And you can use <code>0x80</code> in your assembly program! For example, I typed the hello, world program of the <a href="http://database.sarang.net/study/linux/asm/linux-asm.txt" title="Miyagi's Assembly Tutorial" onclick="javascript:pageTracker._trackPageview ('/outbound/database.sarang.net');">tutorial by Miyagi</a>, and it assembled and ran just fine.</p>
<p>As always, your feedback is welcome.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.halcode.com/archives/2008/06/02/colinux-int-0x80-on-windows-and-other-rants/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Articulatory Speech Synthesis</title>
		<link>http://www.halcode.com/archives/2008/05/24/articulatory-speech-synthesis/</link>
		<comments>http://www.halcode.com/archives/2008/05/24/articulatory-speech-synthesis/#comments</comments>
		<pubDate>Sat, 24 May 2008 15:13:36 +0000</pubDate>
		<dc:creator>Jose</dc:creator>
		
		<category><![CDATA[Speech Technologies]]></category>

		<category><![CDATA[acoustic]]></category>

		<category><![CDATA[articulatory]]></category>

		<category><![CDATA[articulatory speech synthesis]]></category>

		<category><![CDATA[corpus]]></category>

		<category><![CDATA[glottis]]></category>

		<category><![CDATA[models]]></category>

		<category><![CDATA[muscles]]></category>

		<category><![CDATA[phone]]></category>

		<category><![CDATA[phoneme]]></category>

		<category><![CDATA[phonological]]></category>

		<category><![CDATA[speakers]]></category>

		<category><![CDATA[speech]]></category>

		<category><![CDATA[speech synthesis]]></category>

		<category><![CDATA[synthesizer]]></category>

		<category><![CDATA[vector]]></category>

		<category><![CDATA[voice]]></category>

		<guid isPermaLink="false">http://www.halcode.com/?p=25</guid>
		<description><![CDATA[Solution to the inverse problem is interesting, among other reasons, for the reduction of memory space and bandwidth requirements for storage and transmission of speech signals.]]></description>
			<content:encoded><![CDATA[<p>Today, we'll temporarily move away from assembly programming. It's time to discuss a topic that I like a lot: <strong>articulatory speech synthesis</strong>. Simply put, speech synthesis comprises all the processes involved in producing synthetic speech signals. Currently, the most popular method for this task is the concatenative approach, which yields synthetic speech output by combining pre-recorded speech segments. Such segments, recorded from human speakers, are collected into a large database, or corpus, which is segmented based on phonological features of a language, e.g., transitions from one phoneme to at least one other phoneme. A <strong>phoneme</strong> is the smallest posited structural unit that distinguishes meaning. It's important to point out that phonemes are not the physical segments themselves but, in theoretical terms, cognitive abstractions or categorizations of them. In turn, physical segments, referred to as <strong>phones</strong>, constitute the instances of phonemes in actual utterances. For example, the words "madder" and "matter" are obviously composed of distinct <em>phonemes</em>; however, in American English, both words are pronounced almost identically, which means that their <em>phones</em> are the same, or at least very close in the acoustic domain.</p>
<p>On the other hand, articulatory synthesis produces a complete synthetic output, typically based on mathematical models of the structures (lips, teeth, tongue, glottis, and velum, for instance) and processes (transit of airflow along the supraglottal cavities, for instance) of speech. Technically, articulatory speech synthesis transforms a vector p(t) of anatomic or physiologic parameters into a speech signal Sv with predefined acoustic properties. For example, p(t) may include hyoid and tongue body position, protrusion and opening of lips, area of the velopharyngeal port, and so on. This way, an articulatory synthesizer ArtS maps the articulatory domain (from which p(t) is drawn) into the acoustic domain (where frequency properties of Sv lie). Computing the acoustic properties of Sv is the task of a special function. Now, using these definitions, the <strong>speech inverse problem</strong> is stated as an optimization problem, in which we try to find the best p(t) to minimize the acoustic distance between Sv and the output of ArtS.</p>
<p><span id="more-25"></span>The solution to the inverse problem is interesting for the following applications:</p>
<ol>
<li>Reduction of memory space and bandwidth requirements for storage and transmission of speech signals.</li>
<li>Low-cost and noninvasive understanding and collection of data on phonatory processes.</li>
<li>Speech recognition, by means of a transition to the articulatory domain, where signals may be characterized by fewer parameters.</li>
<li>Retrieving the best parameters for synthesis of high-quality speech signals.</li>
</ol>
<p>However, because the mapping between the articulatory and acoustic domains is nonlinear and many-to-one, defining and finding acceptable solutions to the inverse problem are not trivial issues. In general, a candidate solution is evaluated by some measure in the acoustic domain. Furthermore, from the family of solutions to the problem, we are frequently interested only in those configurations consistent with the descriptions of articulatory phonetics. Several groups have approached this problem. For example, Yehia and Itakura adopted an approach based on geometric representations of the articulatory space, including spatial constraints. Dusan and Deng used analytical methods to recover the vocal tract configurations. Sondhi and Schroeter relied on a codebook technique. Genetic algorithms have also been used, although the approach and type of signals studied differ from those used in this research. These latter studies mainly investigate relations between articulation and perception on the basis of the task-dynamic description of inputs to a synthesizer. More recent research resorts to control points measured experimentally on a group of speakers, and inversion minimizes the distance between the articulatory model and those points by using quadratic approximations. For our part, we have previously investigated the application of computational intelligence techniques to the speech inverse problem: specifically, fuzzy rules to model the tongue kinematics, neural networks to generate the glottal airflow, and genetic algorithms to carry out the overall optimization process. Another novelty of our previous research was the use of the five Spanish vowels as target phonemes for inversion.</p>
<p><strong>Synthesis Models</strong></p>
<p>At a broader level, ArtS integrates two models: the articulatory model and the acoustic model. An articulatory model represents the essential components of speech production, and its main purpose is to compute the area function A(x, t), which describes the variation in cross-sectional area of the acoustic tube bounded by the glottis at one end and the mouth at the other. Here, transitions between phonemes are not studied, so the time variable will be dropped from the area function and from the vector p. For its part, an acoustic model specifies the transformations between A(x) and the acoustic domain. Naturally, such a mapping also requires information about the energy source exciting the tract. According to the <strong>acoustic theory of speech production</strong>, the target phonemes are considered as the output of a filter characterized by A(x) and excited by a periodic glottal signal.</p>
<p>In this post, we'll restrict our presentation to the <strong>Articulatory Model</strong>:</p>
<p><img class="aligncenter" src="http://www.halcode.com/images/artmodel.jpg" alt="Articulatory Model" width="577" height="488" />Phonemes of interest can be characterized on the articulatory midsagittal plane. Models of this type became established with Mermelstein’s paper, whose metrics and observations have been reused by several researchers, mainly because of its explicit and complete explanations. Midsagittal models describe the position and movement of the articulators on that plane. Because the parameters of this model have an anatomical meaning, they simplify the visualization of articulatory configurations and help inversion techniques reject some groups of unnatural configurations. The equilibrium position of our model, depicted in the above figure, is partially based on Mermelstein’s measurements. As a result of the muscular contractions linked to articulation, the model may leave its equilibrium configuration and ideally take on the articulatory configuration corresponding to a specific phoneme; therein lies the goal of inversion. In this respect, our articulatory vector p groups several supraglottal muscles whose activity alters the state of the midsagittal outline. In numerical terms, there are no complete and definitive data on this muscular activity and its kinematic effects. We may represent the activity of each muscle as a real number between 0.0 and 1.0, denoting null and maximum muscular activity, respectively. The general effects of muscular contractions are taken from approximations in the literature. Specifically, in order to cover the articulatory space, I like to store the following 12 muscles in p:</p>
<ol>
<li>Middle Pharynx Constrictor: This muscle influences the horizontal displacement of the hyoid bone and its surrounding zones, by approximately 0.5 cm.</li>
<li>Masseter (MA): Raises the jaw, through an angle of up to 0.15 radians relative to the temporomandibular joint.</li>
<li>Mylohyoid (MH): Lowers the mandible, by up to 0.20 radians with respect to the temporomandibular joint.</li>
<li>Styloglossus (SG): Retracts the tongue in direction of the styloid process.</li>
<li>Hyoglossus (HG): Descends and slightly retracts the tongue body.</li>
<li>Anterior Genioglossus (GGa): Descends the tongue tip.</li>
<li>Middle Genioglossus (GGm): Moves tongue body forward, and slightly downward.</li>
<li>Posterior Genioglossus (GGp): Raises and advances the tongue body.</li>
<li>Intrinsic tongue muscles (Up): Raise tongue tip.</li>
<li>Intrinsic tongue muscles (Down): Descend tongue tip.</li>
<li>Risorius: Spreads the lips, changing the width of the lip opening up to 4.5 cm.</li>
<li>Orbicularis oris: Acts for lip rounding, pulling the upper and lower lips together (with a respective maximum vertical change of 1 cm) and protruding them (up to 2.0 cm).</li>
</ol>
<p>In upcoming posts, we'll continue our review of articulatory speech synthesis. And I have yet to provide the references.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.halcode.com/archives/2008/05/24/articulatory-speech-synthesis/feed/</wfw:commentRss>
		</item>
		<item>
		<title>hello world, C and GNU as</title>
		<link>http://www.halcode.com/archives/2008/05/11/hello-world-c-and-gnu-as/</link>
		<comments>http://www.halcode.com/archives/2008/05/11/hello-world-c-and-gnu-as/#comments</comments>
		<pubDate>Sun, 11 May 2008 15:44:54 +0000</pubDate>
		<dc:creator>Jose</dc:creator>
		
		<category><![CDATA[Assembler]]></category>

		<category><![CDATA[C]]></category>

		<category><![CDATA[Languages]]></category>

		<category><![CDATA[Programming]]></category>

		<category><![CDATA[assembly]]></category>

		<category><![CDATA[at&amp;t]]></category>

		<category><![CDATA[bytes]]></category>

		<category><![CDATA[code]]></category>

		<category><![CDATA[coff]]></category>

		<category><![CDATA[Debug]]></category>

		<category><![CDATA[directives]]></category>

		<category><![CDATA[gas]]></category>

		<category><![CDATA[gcc]]></category>

		<category><![CDATA[gnu]]></category>

		<category><![CDATA[hello world]]></category>

		<category><![CDATA[instructions]]></category>

		<category><![CDATA[intel]]></category>

		<category><![CDATA[pe]]></category>

		<category><![CDATA[stack]]></category>

		<category><![CDATA[x86]]></category>

		<guid isPermaLink="false">http://www.halcode.com/?p=21</guid>
		<description><![CDATA[A thing all these programs had in common was their use of the 09h function of INT 21h for printing the "hello, world!" string. But it's time to move forward. Now I plan to use the lovely C printf function.]]></description>
			<content:encoded><![CDATA[<p style="text-align: center;"><img class="aligncenter" src="http://www.halcode.com/images/gnu320.png" alt="GNU Head" /></p>
<p>Finally, it's time to switch to the fabulous <strong>GNU as</strong>. We'll forget about DEBUG for some time. Thanks, DEBUG. <em>GNU as</em>, <em>Gas</em>, or the <em>GNU Assembler</em>, is obviously the assembler used by the GNU Project. It is part of the Binutils package, and acts as the default back-end of gcc. Gas is very powerful and can target several computer architectures. Quite a program, then. As with most assemblers, Gas's input consists of <strong>directives</strong> (also referred to as pseudo-ops), <strong>comments</strong>, and of course, <strong>instructions</strong>. Instructions depend heavily on the target computer architecture. Conversely, directives tend to be relatively homogeneous.</p>
<p><strong>1 Syntax</strong></p>
<p>Originally, this assembler only accepted the AT&amp;T assembler syntax, even for the Intel x86 and x86-64 architectures. The AT&amp;T syntax is different from the one used in most Intel references. There are several differences, the most memorable being that two-operand instructions have the source and destination in the opposite order. For example, the instruction <code>mov ax, bx</code> would be expressed in AT&amp;T syntax as <code>movw %bx, %ax</code>, i.e., the rightmost operand is the destination, and the leftmost one is the source. Another distinction is that register names used as operands must be preceded by a percent (%) sign. However, since version 2.10, Gas supports Intel syntax by means of the <code>.intel_syntax</code> directive. But in the following we'll be using AT&amp;T syntax.</p>
<p><span id="more-21"></span><strong>2 Our Goals</strong></p>
<p>What we'll be doing is creating a new instance of a hello, world! program. Let's recapitulate the articles we've studied so far. First, we presented <a href="http://www.halcode.com/archives/2008/01/21/hello-world/" title="hello world motivation">some reminiscences and motivations for hello, world!</a>. Next, we coded a <a href="http://www.halcode.com/archives/2008/04/17/debugging-hello-world/" title="hello world with DOS DEBUG">hello, world! program by using the MS-DOS DEBUG program</a>. Later, we <a href="http://www.halcode.com/archives/2008/04/28/encoding-intel-x86ia-32-assembler-instructions/" title="Encoding Intel Assembly">encoded that program directly in hexadecimal</a> (no need for DEBUG). And finally, we <a href="http://www.halcode.com/archives/2008/05/04/writing-programs-with-echo-dos/" title="hello world with echo">abused the MS-DOS ECHO command</a> to create a binary, executable hello, world! program directly from the DOS command line (again, no need for DEBUG.) One thing all these programs had in common was their use of the 09h function of INT 21h for printing the "hello, world!" string. But it's time to move forward. Now I plan to use the lovely C printf function. In C, our greeting program would be</p>
<pre class="c"><span style="color: #993333;">int</span> main<span style="color: #66cc66;">&#40;</span><span style="color: #66cc66;">&#41;</span>
<span style="color: #66cc66;">&#123;</span>
    <a href="http://www.opengroup.org/onlinepubs/009695399/functions/printf.html"><span style="color: #000066;">printf</span></a><span style="color: #66cc66;">&#40;</span><span style="color: #ff0000;">&quot;hello, world!<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><span style="color: #66cc66;">&#41;</span>;
    <span style="color: #b1b100;">return</span> <span style="color: #cc66cc;">0</span>;
<span style="color: #66cc66;">&#125;</span></pre>
<p>We've omitted inclusion of the stdio.h header. We could resort to a single statement, <code>return printf("hello, world!\n") - 14;</code>, but I think that by using two statements we'll get clearer code. We save our program in a file called "hello.c", and compile with</p>
<pre>gcc -o hello.exe hello.c</pre>
<p>I'll be working on Windows, with the MinGW port of the GNU Compiler Collection. I like MinGW a lot, especially its ability to provide native functionality via direct Windows API calls, which is good for the performance of our applications. Working in Windows means that our executable files (object code and DLLs too) follow the PE/COFF format. The Portable Executable (PE) file format is a wrapper for all the information the Windows loader requires in order to run the code. PE is a modified version of the Unix COFF file format (hence the reference PE/COFF.) Another popular file format for executable code is ELF (Executable and Linkable Format), which is used by Linux, the Nintendo Wii and DS, and the PlayStation 3. For the time being, we only have to know that the behavior of <em>GNU as</em> varies according to the target file format (in our case, PE/COFF.)</p>
<p>gcc can also provide us with the x86 assembly file it used. I typed gcc -S hello.c and this was the output I got:</p>
<pre class="asm">.file    <span style="color: #7f007f;">&quot;hello.c&quot;</span>
    .def    ___main<span style="color: #adadad; font-style: italic;">;    .scl    2;    .type 32;    .endef</span>
    .section .rdata,<span style="color: #7f007f;">&quot;dr&quot;</span>
LC0:
    .ascii <span style="color: #7f007f;">&quot;hello, world!\12\0&quot;</span>
.text
.globl _main
    .def    _main<span style="color: #adadad; font-style: italic;">;    .scl    2;    .type    32;    .endef</span>
_main:
    pushl    %<span style="color: #46aa03; font-weight:bold;">ebp</span>
    movl    %<span style="color: #46aa03; font-weight:bold;">esp</span>, %<span style="color: #46aa03; font-weight:bold;">ebp</span>
    subl    $<span style="color: #ff0000;">8</span>, %<span style="color: #46aa03; font-weight:bold;">esp</span>
    andl    $<span style="color: #ff0000;">-16</span>, %<span style="color: #46aa03; font-weight:bold;">esp</span>
    movl    $<span style="color: #ff0000;">0</span>, %<span style="color: #46aa03; font-weight:bold;">eax</span>
    addl    $<span style="color: #ff0000;">15</span>, %<span style="color: #46aa03; font-weight:bold;">eax</span>
    addl    $<span style="color: #ff0000;">15</span>, %<span style="color: #46aa03; font-weight:bold;">eax</span>
    shrl    $<span style="color: #ff0000;">4</span>, %<span style="color: #46aa03; font-weight:bold;">eax</span>
    <span style="color: #0000ff;">sall</span>    $<span style="color: #ff0000;">4</span>, %<span style="color: #46aa03; font-weight:bold;">eax</span>
    movl    %<span style="color: #46aa03; font-weight:bold;">eax</span>, <span style="color: #ff0000;">-4</span><span style="color: #66cc66;">&#40;</span>%<span style="color: #46aa03; font-weight:bold;">ebp</span><span style="color: #66cc66;">&#41;</span>
    movl    <span style="color: #ff0000;">-4</span><span style="color: #66cc66;">&#40;</span>%<span style="color: #46aa03; font-weight:bold;">ebp</span><span style="color: #66cc66;">&#41;</span>, %<span style="color: #46aa03; font-weight:bold;">eax</span>
    <span style="color: #00007f;">call</span>    __alloca
    <span style="color: #00007f;">call</span>    ___main
    movl    $LC0, <span style="color: #66cc66;">&#40;</span>%<span style="color: #46aa03; font-weight:bold;">esp</span><span style="color: #66cc66;">&#41;</span>
    <span style="color: #00007f;">call</span>    _printf
    movl    $<span style="color: #ff0000;">0</span>, %<span style="color: #46aa03; font-weight:bold;">eax</span>
    <span style="color: #00007f;">leave</span>
    <span style="color: #00007f;">ret</span>
    .def    _printf<span style="color: #adadad; font-style: italic;">;    .scl    2;    .type    32;    .endef</span></pre>
<p><strong>3 Code Explanations</strong></p>
<p>From a general view, we identify 3 elements in the above listing. First, we have <strong>directives</strong>, which are symbols beginning with a '.' (dot.) As aforesaid, directives are typically valid for any computer. If the symbol begins with a letter, the statement is an assembly language <strong>instruction</strong>, i.e., it will assemble into a machine language instruction, and surely will differ between computer architectures. Finally, labels are those symbols immediately followed by a ':' (colon.) We may think of labels as "addresses" for data or code. Now let's do a shallow review of a few germane directives, so bear with me.</p>
<p><strong>.file <em>string</em></strong></p>
<p>This directive identifies the start of the logical file (and <em>string</em> should be the file name.) Actually, the directive is ignored and is only there for compatibility with old versions. We can remove it.</p>
<p><strong>.def <em>name</em> ... .endef</strong></p>
<p>This pair of directives encloses debugging information for the symbol <var>name</var>, and is only observed when <em>Gas</em> is configured for PE/COFF format output. But we don't need it for a simple hello, world! program.</p>
<p><strong>.section <em>name</em></strong></p>
<p>This directive indicates that the following code has to be assembled into a section called <em>name</em>. For PE/COFF targets, the <code>.section</code> directive is used in one of the following ways:</p>
<p><code>.section <em>name</em> [, "<em>flags</em>"]</code><br />
<code>.section <em>name</em> [, <em>subsegment</em>]</code></p>
<p>The gcc output we got uses the form with <em>flags</em>, and specifically, two single-character flags indicate the attributes of the section: d (data section) and r (read-only section.) But again, we don't need to explicitly signal section attributes for our simple program.</p>
<p><strong>.ascii <em>"string"</em></strong></p>
<p>Defines one or more string literals (separated by commas.) Each string is assembled into consecutive addresses (with no trailing zero character.)</p>
<p><strong>.text <em>subsection</em></strong></p>
<p>Tells <em>Gas</em> to assemble the following statements onto the end of the text subsection numbered <em>subsection</em>. If <em>subsection</em> is omitted (as in our case), subsection number zero is used. Clearly, this directive is mandatory, or Gas will not assemble the code to print our hello, world! message.</p>
<p><strong>.global <em>symbol</em></strong> (or <strong>.globl <em>symbol</em></strong>)</p>
<p><code>.global</code> makes <em>symbol</em> visible to the linker. In our case, we want to inform the linker about the <code>_main</code> function that it is expecting. For compatibility with other assemblers, both spellings (<code>.global</code> or <code>.globl</code>) are valid.</p>
<p>Now, directives are done. After label <code>_main</code> we only have assembly code up to the <code>ret</code> instruction. Some of this code should be clear if you have previous experience with assembly programming. Nevertheless, let's review these instructions too. Note that the 'l' at the end of mnemonics such as <code>pushl</code> and <code>movl</code> tells <em>Gas</em> that we want to use the version of the instruction that works with "long" (32-bit) operands.</p>
<p>The first 3 instructions are typical code for setting up the stack frame:</p>
<pre class="asm">pushl   %<span style="color: #46aa03; font-weight:bold;">ebp</span>
movl    %<span style="color: #46aa03; font-weight:bold;">esp</span>, %<span style="color: #46aa03; font-weight:bold;">ebp</span>
subl    $<span style="color: #ff0000;">8</span>, %<span style="color: #46aa03; font-weight:bold;">esp</span></pre>
<p>By subtracting 8 bytes from ESP we're reserving the space on the stack to hold local variables (the Intel stack "grows" from high memory locations to the lower ones.) Next we have the rarer</p>
<pre class="asm">andl	$<span style="color: #ff0000;">-16</span>, %<span style="color: #46aa03; font-weight:bold;">esp</span></pre>
<p>Remember that in hexadecimal, -16 is expressed as 0xFFFFFFF0. Therefore, this <code>and</code> clears the four low bits of ESP, which aligns the stack to the next lower 16-byte boundary. The reasons for this alignment are not very clear to me. It may be a gcc choice in order to accelerate floating point accesses, or it may be for compatibility with a particular architecture. Either way, we don't require such alignment for displaying hello, world!</p>
<p>The following code is mostly a very contrived way of storing a value in EAX:</p>
<pre class="asm">movl	$<span style="color: #ff0000;">0</span>, %<span style="color: #46aa03; font-weight:bold;">eax</span>
addl	$<span style="color: #ff0000;">15</span>, %<span style="color: #46aa03; font-weight:bold;">eax</span>
addl	$<span style="color: #ff0000;">15</span>, %<span style="color: #46aa03; font-weight:bold;">eax</span>
shrl	$<span style="color: #ff0000;">4</span>, %<span style="color: #46aa03; font-weight:bold;">eax</span>
<span style="color: #0000ff;">sall</span>	$<span style="color: #ff0000;">4</span>, %<span style="color: #46aa03; font-weight:bold;">eax</span>
movl	%<span style="color: #46aa03; font-weight:bold;">eax</span>, <span style="color: #ff0000;">-4</span><span style="color: #66cc66;">&#40;</span>%<span style="color: #46aa03; font-weight:bold;">ebp</span><span style="color: #66cc66;">&#41;</span>
movl	<span style="color: #ff0000;">-4</span><span style="color: #66cc66;">&#40;</span>%<span style="color: #46aa03; font-weight:bold;">ebp</span><span style="color: #66cc66;">&#41;</span>, %<span style="color: #46aa03; font-weight:bold;">eax</span></pre>
<p>Clearly the code is not optimized, as there are a lot of unnecessary lines. Moreover, the final value of EAX is also stored into memory previously reserved on the stack. It seems the value in EAX is a parameter for the <code>_alloca</code> invocation in the two following lines:</p>
<pre class="asm"><span style="color: #00007f;">call</span>	__alloca
<span style="color: #00007f;">call</span>	___main</pre>
<p>These two calls are unnecessary for our toy application. We won't delve into details, but I'll say that <code>alloca()</code> is a function used to allocate memory on the stack. And if PE/COFF binaries are used, and our application has an <code>int main()</code> function, then a function <code>void __main()</code> should be called first thing after entering <code>main()</code>. We'll leave it at that for now. More information can be found in <a href="http://www.osdev.org/mediawiki/index.php?title=How_kernel%2C_compiler%2C_and_C_library_work_together&amp;printable=yes" title="Kernel, compiler and C library" onclick="javascript:pageTracker._trackPageview ('/outbound/www.osdev.org');">this excellent and instructive article from OSDevWiki</a>.</p>
<p>At last, we find the useful code</p>
<pre class="asm">movl    $LC0, <span style="color: #66cc66;">&#40;</span>%<span style="color: #46aa03; font-weight:bold;">esp</span><span style="color: #66cc66;">&#41;</span>
<span style="color: #00007f;">call</span>    _printf</pre>
<p>It moves the address of the ASCII string onto the stack, and invokes <code>printf</code>. Now, where's the definition of <code>printf</code>? Well, we'll take it from the C library, of course. The linker (<em>ld</em>) is responsible for associating our code with the definition of <code>printf</code>.</p>
<p>Finally, we find</p>
<pre class="asm">movl    $<span style="color: #ff0000;">0</span>, %<span style="color: #46aa03; font-weight:bold;">eax</span>
<span style="color: #00007f;">leave</span>
<span style="color: #00007f;">ret</span></pre>
<p>These instructions constitute the "returning code." They store the return value (0 == success!) in EAX, tear down the stack frame, and pop the saved Instruction Pointer from the stack in order to return control to the calling procedure or program.</p>
<p>If we strip all the unnecessary lines, our hello, world! takes this form:</p>
<pre class="asm">.<span style="color: #0000ff;">data</span>
LC0:
    .ascii <span style="color: #7f007f;">&quot;hello, world!\n\0&quot;</span>
.text
    .global _main
_main:
    pushl    %<span style="color: #46aa03; font-weight:bold;">ebp</span>
    movl    %<span style="color: #46aa03; font-weight:bold;">esp</span>, %<span style="color: #46aa03; font-weight:bold;">ebp</span>
    subl    $<span style="color: #ff0000;">4</span>, %<span style="color: #46aa03; font-weight:bold;">esp</span>
    movl    $LC0, <span style="color: #66cc66;">&#40;</span>%<span style="color: #46aa03; font-weight:bold;">esp</span><span style="color: #66cc66;">&#41;</span>
    <span style="color: #00007f;">call</span>    _printf
    movl    $<span style="color: #ff0000;">0</span>, %<span style="color: #46aa03; font-weight:bold;">eax</span>
    <span style="color: #00007f;">leave</span>
    <span style="color: #00007f;">ret</span></pre>
<p>Shorter and clearer. I assembled the hard way, step by step:</p>
<pre>as -o hello.o hello.s

ld -o hello.exe
/mingw/lib/crt2.o
C:/MinGW/bin/../lib/gcc/mingw32/3.4.5/crtbegin.o
-LC:/MinGW/bin/../lib/gcc/mingw32/3.4.5
-LC:/MinGW/lib hello.o
-lmingw32 -lgcc -lmsvcrt -lkernel32
C:/MinGW/bin/../lib/gcc/mingw32/3.4.5/crtend.o</pre>
<p>But it's better to just type <code>gcc -o hello.exe hello.s</code> <img src='http://www.halcode.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /></p>
]]></content:encoded>
			<wfw:commentRss>http://www.halcode.com/archives/2008/05/11/hello-world-c-and-gnu-as/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Writing Programs with Echo (DOS)</title>
		<link>http://www.halcode.com/archives/2008/05/04/writing-programs-with-echo-dos/</link>
		<comments>http://www.halcode.com/archives/2008/05/04/writing-programs-with-echo-dos/#comments</comments>
		<pubDate>Sun, 04 May 2008 00:11:43 +0000</pubDate>
		<dc:creator>Jose</dc:creator>
		
		<category><![CDATA[Assembler]]></category>

		<category><![CDATA[Debug]]></category>

		<category><![CDATA[Languages]]></category>

		<category><![CDATA[Programming]]></category>

		<category><![CDATA[Retro]]></category>

		<category><![CDATA[assembly]]></category>

		<category><![CDATA[command]]></category>

		<category><![CDATA[dos]]></category>

		<category><![CDATA[echo]]></category>

		<category><![CDATA[editor]]></category>

		<category><![CDATA[hacking]]></category>

		<category><![CDATA[hexadecimal]]></category>

		<category><![CDATA[intel]]></category>

		<category><![CDATA[ms-dos]]></category>

		<category><![CDATA[notepad]]></category>

		<category><![CDATA[opcodes]]></category>

		<category><![CDATA[program]]></category>

		<guid isPermaLink="false">http://www.halcode.com/?p=19</guid>
		<description><![CDATA[How do you input those characters as parameters for the echo command? I found no way of doing that. If you know a way, please drop me a line.]]></description>
			<content:encoded><![CDATA[<p><strong>Is that possible?</strong> Yes, it is. It's just a matter of redirecting echo output to a file. Writing the program with echo should be a straightforward task if we are able to produce the sequence of characters corresponding to the intended binary, executable file. <strong>Is that useful?</strong> Surely not. But it's a <em>healthy</em> way to waste your time <img src='http://www.halcode.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> As <a href="http://www.halcode.com/archives/2008/04/28/encoding-intel-x86ia-32-assembler-instructions/#comment-51" title="Assember Text Editor">suggested by a reader</a>, this can be achieved by writing the characters of the executable file, using a simple text editor like notepad or even the old MS-DOS Editor. Of course, the program should be relatively small or we would venture into the dangerous lands of masochism. By using the echo command of DOS we will be following the conceited style of doing things <img src='http://www.halcode.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> But we'll restrict this post to the simple hello, world! program we have been reviewing <a href="http://www.halcode.com/archives/2008/04/28/encoding-intel-x86ia-32-assembler-instructions/" title="Assembly Programming">in previous entries</a>.</p>
<p><span id="more-19"></span>The hexadecimal code of our program is:</p>
<pre class="asm">EB <span style="color: #ff0000;">12</span> 0D 0A <span style="color: #ff0000;">68</span> <span style="color: #ff0000;">65</span> 6C 6C 6F 2C <span style="color: #ff0000;">20</span> <span style="color: #ff0000;">77</span> 6F <span style="color: #ff0000;">72</span> 6C <span style="color: #ff0000;">64</span>
<span style="color: #ff0000;">21</span> 0D 0A <span style="color: #ff0000;">24</span> B4 <span style="color: #ff0000;">09</span> BA <span style="color: #ff0000;">02</span> <span style="color: #ff0000;">01</span> CD <span style="color: #ff0000;">21</span> B4 <span style="color: #ff0000;">00</span> CD <span style="color: #ff0000;">21</span> 0D</pre>
<p>Now, we only have to input and redirect these hexadecimal values to a file, that we'll name <code>hello.com</code>. </p>
<p>That would be fairly easy except for some values such as <code>00</code> and <code>09</code>, which represent the NULL and TAB characters, respectively. How do you input those characters as parameters for the echo command? I found no way of doing that. If you know a way, please <a href="http://www.halcode.com/about/" title="Contact me">drop me a line</a>. Therefore, I changed the code of the program in two ways:</p>
<ul>
<li>The <code>09</code> character comes from the instruction <code>mov ah,9</code>. I replaced that with two instructions: <code>mov ah,7</code> and <code>add ah,2</code>. The semantics stay intact, but the contrived approach allows us to discard the 09 character.</li>
<li>Regarding the NULL character (<code>00</code>), it's a consequence of the line <code>mov ah,00</code>. But we can accomplish the effect of clearing <code>ah</code> by executing <code>xor ax,ax</code> instead. And that's it.</li>
</ul>
<p>Take a look at the complete command I used:</p>
<p style="text-align: center;"><img class="aligncenter" src="http://www.halcode.com/images/echo_hello.png" alt="hello, world!" width="674" height="337" /></p>
<p style="text-align: left;">Nice <img src='http://www.halcode.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://www.halcode.com/archives/2008/05/04/writing-programs-with-echo-dos/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Encoding Intel x86/IA-32 Assembler Instructions</title>
		<link>http://www.halcode.com/archives/2008/04/28/encoding-intel-x86ia-32-assembler-instructions/</link>
		<comments>http://www.halcode.com/archives/2008/04/28/encoding-intel-x86ia-32-assembler-instructions/#comments</comments>
		<pubDate>Mon, 28 Apr 2008 00:46:55 +0000</pubDate>
		<dc:creator>Jose</dc:creator>
		
		<category><![CDATA[Assembler]]></category>

		<category><![CDATA[Debug]]></category>

		<category><![CDATA[History]]></category>

		<category><![CDATA[Retro]]></category>

		<category><![CDATA[16 bits]]></category>

		<category><![CDATA[32 bits]]></category>

		<category><![CDATA[64 bits]]></category>

		<category><![CDATA[8 bits]]></category>

		<category><![CDATA[amd]]></category>

		<category><![CDATA[amd64]]></category>

		<category><![CDATA[assembly]]></category>

		<category><![CDATA[binary]]></category>

		<category><![CDATA[coding]]></category>

		<category><![CDATA[hello world]]></category>

		<category><![CDATA[hexadecimal]]></category>

		<category><![CDATA[ia-32]]></category>

		<category><![CDATA[intel]]></category>

		<category><![CDATA[machine code]]></category>

		<category><![CDATA[ms-debug]]></category>

		<category><![CDATA[opcode]]></category>

		<category><![CDATA[opcodes]]></category>

		<category><![CDATA[x86]]></category>

		<guid isPermaLink="false">http://www.halcode.com/?p=17</guid>
		<description><![CDATA[Translation of the second line is a direct and solved issue. What about <code>jmp 114</code>? Well, we want to jump over the data (18 bytes, one byte per each character in the string.) IASDM tell us (Appendix B) that the opcode for unconditional jumps in the same segment is 11101011, which in hexadecimal, is expressed as EB.]]></description>
			<content:encoded><![CDATA[<p style="text-align: left;">Albeit I decided to write about twice (perhaps once) a week, and so far have only 4 posts, I'm surprised for the amount of readers this blog already has. Thanks a lot to everybody! <img src='http://www.halcode.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> One of those readers, <a href="http://www.halcode.com/archives/2008/04/17/debugging-hello-world/#comment-21">commenting</a> on the post "<a href="http://www.halcode.com/archives/2008/04/17/debugging-hello-world/">Debugging hello, world</a>" asked about the reason for translating the instruction <code>jmp 114</code> into hexadecimal <code>EB12</code>. To answer this, we are going to recur to the "lovely" and elder Intel Architecture Software Developer Manual (IASDM), Volume 2. This volume describes the instructions set of the Intel Architecture processor (x86/IA-32) and the opcode structure. I'll review some terms involved here:</p>
<ul style="text-align: left;"> <strong>x86</strong>: It refers to the instruction set of the Intel-compatible CPU architectures (chips produced by Intel, AMD, VIA, and others) inaugurated by Intel's original 16-bit 8086 CPU. A decision which proved wise was to make each new instance of x86 processors almost fully backwards compatible.<br />
<strong>IA-32</strong>: It is Intel's 32-bit implementation of the x86 architecture; <em>IA-32</em> distinguishes this implementation from the preceding 16-bit x86 processors. Note that when the 64-bit era arrived, Intel launched its Itanium processor, which discards compatibility with the IA-32 instruction set. Such 64-bit architecture description and implementation is referred to as IA-64, meaning "Intel Architecture, 64-bit", but even though the names are similar, IA-32 and IA-64 are very different architectures and instruction sets. However, AMD's response to Intel's 64-bit processors uses an instruction set that, in essence, is composed of 64-bit extensions to IA-32, i.e., it's a superset of the x86 instruction set. Such instruction set is referred to as AMD64 (initially, x86-64.) Later, Intel cloned it under the name Intel 64. AMD's processors Athlon 64, Turion 64, Opteron, Sempron, etc., are based on AMD64.<br />
<strong>Opcode</strong>: An opcode (<strong>op</strong>eration <strong>code</strong>) is the part of a <em>machine language</em> instruction (pure binary code) specifying the operation to be performed. The other portion of the instruction is the operand, which is optional and represents the data to be operated on. In assembly language, <em>mnemonics</em> are used to represent the opcodes. Concretely, and according to the IASDM, a mnemonic is a reserved name for a class of instruction opcodes which have the same function. For example, in <code>JMP 114</code>, the mnemonic is <code>JMP</code>, and the operand is 114 (remember, 114 in hexadecimal, which is 276 in decimal.)</ul>
<p><span id="more-17"></span></p>
<p style="text-align: left;">Unlike in high-level languages, there is usually a one-to-one correspondence between basic assembly statements and the binary code of machine language instructions. Nevertheless, in some cases, an assembler may provide pseudo-instructions which expand into several machine language instructions to provide commonly needed functionality. Or no instruction at all, such as <code>DB</code> in</p>
<p style="text-align: center;"><code>db 0d,0a,"hello, world!",0d,0a,"$"</code></p>
<p style="text-align: left;">which directly translates into the sequence of characters (in hexadecimal):</p>
<pre class="asm" style="text-align: center;">0D 0A 68 65 6C 6C 6F 2C 20 77 6F 72 6C 64 21 0D 0A 24</pre>
<p>Therefore, pseudo-instruction <code>DB</code> acts only as a data markup for the assembler. Now, for clarity, I'll repeat the code of <a href="http://www.halcode.com/archives/2008/04/17/debugging-hello-world/">Debugging "hello, world"</a> here:</p>
<pre class="asm">- a <span style="color: #ff0000;">100</span>
<span style="font-weight: bold; color: #46aa03;">CS</span>:<span style="color: #ff0000;">0100</span> <span style="color: #00007f;">jmp</span> <span style="color: #ff0000;">114</span>         <span style="font-style: italic; color: #adadad;">; Jump over the 18 bytes of the string</span>
<span style="font-weight: bold; color: #46aa03;">CS</span>:<span style="color: #ff0000;">0102</span> <span style="color: #0000ff;">db</span> 0d,0a,<span style="color: #7f007f;">"hello, world!"</span>,0d,0a,<span style="color: #7f007f;">"$"</span>
<span style="font-weight: bold; color: #46aa03;">CS</span>:<span style="color: #ff0000;">0114</span> <span style="color: #00007f;">mov</span> <span style="font-weight: bold; color: #46aa03;">ah</span>,<span style="color: #ff0000;">9</span>       <span style="font-style: italic; color: #adadad;">; Print function</span>
<span style="font-weight: bold; color: #46aa03;">CS</span>:<span style="color: #ff0000;">0116</span> <span style="color: #00007f;">mov</span> <span style="font-weight: bold; color: #46aa03;">dx</span>,<span style="color: #ff0000;">102</span>
<span style="font-weight: bold; color: #46aa03;">CS</span>:<span style="color: #ff0000;">0119</span> <span style="color: #00007f;">int</span> <span style="color: #ff0000;">21</span>
<span style="font-weight: bold; color: #46aa03;">CS</span>:<span style="color: #ff0000;">011B</span> <span style="color: #00007f;">mov</span> <span style="font-weight: bold; color: #46aa03;">ah</span>, <span style="color: #ff0000;">0</span>      <span style="font-style: italic; color: #adadad;">; Terminate the program</span>
<span style="font-weight: bold; color: #46aa03;">CS</span>:011D <span style="color: #00007f;">int</span> <span style="color: #ff0000;">21</span>
<span style="font-weight: bold; color: #46aa03;">CS</span>:011F
-g =<span style="color: #ff0000;">100</span></pre>
<p>Translation of the second line is a direct and solved issue. What about <code>jmp 114</code>? Well, we want to jump over the data (18 bytes, one byte per character in the string.) The IASDM tells us (Appendix B) that the opcode for unconditional short jumps in the same segment is 11101011, which, in hexadecimal, is expressed as EB. We need to provide the operand to complete the instruction. In this case, as we want to jump over the string data, our operand is 18 (12 in hexadecimal.) That's why <code>jmp 114</code> translates into EB12. Note that the operand for this <code>jmp</code> specifies the 8-bit <strong>displacement</strong>, counted from the address of the next instruction (0102), i.e., the operand is not an explicit address.</p>
<p>Translation of the other instructions is straightforward, and again we only have to follow the IASDM. Let's analyze encoding of <code>mov ah,9</code> anyway. In this case we have an immediate operand (a constant, 9.) Thus, for moving an immediate operand to a register the encoding adopts this form:</p>
<p style="text-align: center;">1011 w reg : immediate data</p>
<p>There, <em>w</em> represents the bit for operand size. That bit specifies whether the data is byte or full-sized (where full-sized is either 16 or 32 bits.) As we'll be using 8-bit operands, we set the bit to 0. For its part, <em>reg</em> is a 3-bit sequence identifying the destination register. Table B-3 of the IASDM dictates that if <em>w</em> = 0, then register AH is encoded as binary 100. Thus, encoding of <code>mov ah,9</code> is</p>
<pre class="asm" style="text-align: center;">10110100 00001001</pre>
<p>which in hexadecimal is expressed as B409. The next instruction, <code>mov dx,102</code>, follows a similar approach:</p>
<pre class="asm" style="text-align: center;">1011 1 010 0000 0001 0000 0010</pre>
<p>In this case, however, <em>w</em> is set to 1, as the operand 102 requires more than 1 byte of storage. The 3-bit sequence for DX is 010. Needless to say, 0000 0001 0000 0010 is the binary representation of the hexadecimal value 102 (16 bits are required). Expressing in hexadecimal, we would have BA0102. However, the bytes of the operand have to be stored in reverse order (least significant byte first), and thereby the right encoding for the instruction is BA0201.</p>
<p>Next, <code>INT n</code> (Interrupt type n) is encoded as 1100 1101 : type. Therefore, <code>int 21</code> is encoded as 1100 1101 0010 0001 (CD21 in hexadecimal.) And encoding of <code>mov ah, 0</code> as B400 follows directly from our previous explanations. Finally, we can translate our little "hello, world!" into binary code directly:</p>
<pre class="asm">-e <span style="color: #ff0000;">100</span> EB <span style="color: #ff0000;">12</span> 0D 0A <span style="color: #ff0000;">68</span> <span style="color: #ff0000;">65</span> 6C 6C 6F 2C <span style="color: #ff0000;">20</span> <span style="color: #ff0000;">77</span> 6F <span style="color: #ff0000;">72</span> 6C <span style="color: #ff0000;">64</span>
-e <span style="color: #ff0000;">110</span> <span style="color: #ff0000;">21</span> 0D 0A <span style="color: #ff0000;">24</span> B4 <span style="color: #ff0000;">09</span> BA <span style="color: #ff0000;">02</span> <span style="color: #ff0000;">01</span> CD <span style="color: #ff0000;">21</span> B4 <span style="color: #ff0000;">00</span> CD <span style="color: #ff0000;">21</span> 0D
-g =<span style="color: #ff0000;">100</span></pre>
<p>And that's all. I think that my explanations have been clear. But I'm always open to any suggestions and corrections. Thanks for reading.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.halcode.com/archives/2008/04/28/encoding-intel-x86ia-32-assembler-instructions/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Programmers from the Wild West</title>
		<link>http://www.halcode.com/archives/2008/04/20/programmers-from-the-wild-west/</link>
		<comments>http://www.halcode.com/archives/2008/04/20/programmers-from-the-wild-west/#comments</comments>
		<pubDate>Sat, 19 Apr 2008 23:09:01 +0000</pubDate>
		<dc:creator>Jose</dc:creator>
		
		<category><![CDATA[Programming]]></category>

		<category><![CDATA[Software Engineering]]></category>

		<category><![CDATA[C]]></category>

		<category><![CDATA[coders]]></category>

		<category><![CDATA[coding]]></category>

		<category><![CDATA[cowboy]]></category>

		<category><![CDATA[cowboy programmers]]></category>

		<guid isPermaLink="false">http://www.halcode.com/archives/2008/04/20/programmers-from-the-wild-west/</guid>
		<description><![CDATA[Analysis, Design, and related topics are for sissies, and for allowing professors of Computer Science who are bad at mathematics to make a living. SDLC is a pony. Cowboys ride horses.]]></description>
			<content:encoded><![CDATA[<p>We all know what happens when a project's deadline is not met. Besides someone getting fired, hard, dry heroes appear. Lonesome, ruthless and distrustful heroes who bring the peace only revolvers can win. Sometimes, the guys with the money hire them as the ultimate <em>saviors</em>: they have bothered to come here, from the farthest west, to rescue the project. They are irresistible: they are the <strong>cowboy</strong> programmers. It's men's time.</p>
<p style="text-align: center"><img src="http://www.halcode.com/images/bonanza.jpg" alt="Ben Cartwright &amp; Sons" width="320" height="270" /></p>
<p><span id="more-16"></span><br />
But wait a minute. I remember that, according to Brooks' law, "adding people to a late software project makes it later."</p>
<p><strong>Cowboy wisdom #1</strong>: Yeah. But "if you don't add people to a late software project then it has been canceled."</p>
<p>Do they indeed come from the farthest west? Not necessarily. Perhaps they have been members of the team since the start. Frequently, cowboy (cowgirl) programmers are not external newcomers. They are hidden (or not) in the core of the team. And they are important contributors to the flaws of the project. Blame managers, though, for letting them in and for trusting their nonsensical advice.</p>
<p>However, I've also seen a lot of fellow programmers crumble under the project's pressure. In such circumstances, they awaken their hidden alter ego, and happily put on their hats. I have met several such outlaws, and in what follows, I pay tribute to this lovely and legendary figure of programming.</p>
<p><strong>&gt;&gt; The Truth about Software Development Life Cycle (SDLC) </strong></p>
<p>Cowboy programmers are the absolute reference for everything. Masters of all trades (although they don't have any clue about SDLC, but who cares?) Analysis, Design, and related topics are for sissies, and for allowing professors of Computer Science who are bad at mathematics to make a living. SDLC is a pony. Cowboys ride horses.</p>
<p>Requirements Elicitation anyone? What for? Cowboy programmers know what customers want. Besides, filling myriad pages with arrows and boxes is a complete waste of time: we should be coding instead! That's why the project is late!</p>
<p><strong>Cowboy wisdom #2</strong><strong>:</strong> Reading too many Software Engineering books has rotted your mind. Welcome to life, <em>boy</em>.</p>
<p><strong>&gt;&gt; Learning to Code</strong></p>
<p>Cowboy programmers have no time for thinking. Their code is rushed, but effective. Optimizations are for compilers. They just get things done. How to reverse a string? Here's your answer, enjoy it:</p>
<pre class="c"><span style="color: #993333;">char</span>* rev_str<span style="color: #66cc66;">&#40;</span><span style="color: #993333;">char</span>* str<span style="color: #66cc66;">&#41;</span>
<span style="color: #66cc66;">&#123;</span>
  <span style="color: #993333;">int</span> str_l = strlen<span style="color: #66cc66;">&#40;</span>str<span style="color: #66cc66;">&#41;</span>;
  <span style="color: #993333;">char</span>* r = <span style="color: #66cc66;">&#40;</span><span style="color: #993333;">char</span>*<span style="color: #66cc66;">&#41;</span>malloc<span style="color: #66cc66;">&#40;</span>str_l<span style="color: #66cc66;">&#41;</span>;
  <span style="color: #993333;">int</span> i =  str_l<span style="color: #cc66cc;">-1</span>;
  <span style="color: #b1b100;">while</span> <span style="color: #66cc66;">&#40;</span>*str<span style="color: #66cc66;">&#41;</span>
  <span style="color: #66cc66;">&#123;</span>
    r<span style="color: #66cc66;">&#91;</span>i<span style="color: #66cc66;">&#93;</span> = *str;
    str++;
    i--;
  <span style="color: #66cc66;">&#125;</span>
  r<span style="color: #66cc66;">&#91;</span>str_l<span style="color: #cc66cc;">-1</span><span style="color: #66cc66;">&#93;</span> = <span style="color: #cc66cc;">0</span>;
  <span style="color: #b1b100;">return</span> r;
<span style="color: #66cc66;">&#125;</span></pre>
<p>Please, don't rush into believing that Cowboy Programmers don't comment their code. They do. And they do it excessively. The main difference, perhaps, is that their comments are what is known as <em>auxiliary comments</em>. Let me explain. Imagine that our cowboy programmer finally accepts that his code contains a bug, and decides to use a function of some library. These are the germane comments our cowboy introduces in his code:</p>
<pre class="c"><span style="color: #993333;">char</span>* rev_str<span style="color: #66cc66;">&#40;</span><span style="color: #993333;">char</span>* str<span style="color: #66cc66;">&#41;</span>
<span style="color: #66cc66;">&#123;</span>
  <span style="color: #808080; font-style: italic;">/*
  int str_l = strlen(str);
  char* r = (char*)malloc(str_l);
  int i =  str_l-1;
  while (*str)
  {
    r[i] = *str;
    str++;
    i--;
  }
  r[str_l-1] = 0;
  */</span>
  <span style="color: #b1b100;">return</span> lib_str_reverse<span style="color: #66cc66;">&#40;</span>str<span style="color: #66cc66;">&#41;</span>;
<span style="color: #66cc66;">&#125;</span></pre>
<p>Note how the old code turns into <em>auxiliary comments</em> which nicely explain what <code>lib_str_reverse(char*)</code> does.</p>
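<p>For the record, here is a minimal sketch of what a less cowboyish version might look like. It assumes the intended contract is "return a freshly allocated, reversed copy that the caller must free"; the name <code>rev_str_fixed</code> is hypothetical. Compared with the original, it allocates room for the terminator and writes it after the last character rather than on top of it:</p>
<pre class="c">#include &lt;stdlib.h&gt;
#include &lt;string.h&gt;

/* Hypothetical corrected variant of rev_str.
   Returns a newly allocated reversed copy of str, or NULL if the
   allocation fails. The caller is responsible for calling free(). */
char* rev_str_fixed(const char* str)
{
  size_t len = strlen(str);
  char* r = malloc(len + 1);       /* +1 leaves room for the terminator */
  if (r == NULL)
    return NULL;
  for (size_t i = 0; i &lt; len; i++)
    r[i] = str[len - 1 - i];       /* copy characters back to front */
  r[len] = '\0';                   /* terminate after the last character */
  return r;
}</pre>
<p>It also copes with the empty string, which the original indexes at -1.</p>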
<p><strong>Cowboy wisdom #3</strong><strong>:</strong> <span class="pro002">He that is without sin among you, let him first cast a stone...</span></p>
<p><strong>&gt;&gt; The Dusty Road to Testing</strong></p>
<p>For the cowboy programmer, the notion of "testing" implies the possibility of his code being buggy. Only people with poor self-confidence do tests. People who think of performing tests should be politely removed from the project. If we were to run tests in our project, our cowboy could not make the smallest change in the project without rerunning all sorts of tests. The project is late, remember?</p>
<p>Cowboy programmers always believe that their ideas and code are perfect. Any <em>fiasco</em> in the deliverables is a consequence of bugs in the code of someone else. Fear not, our cowboy even has solutions for such problems.<strong><br />
</strong></p>
<p><strong>Cowboy wisdom #4</strong>: Did you learn in school what a <em>watchdog</em> is? It's a program that restarts the hardware if it seems to be wedged. Do you know what the first piece of code I wrote upon arriving here was? Do you know?</p>
<p><strong>&gt;&gt; Spawns Everywhere</strong></p>
<p>Finally, another important trait of cowboy programmers is their disrespect for the code of other programmers. They will introduce changes everywhere, at will, always bitching about others' incompetence. Their code will dangerously spread.</p>
<p><strong>Cowboy wisdom #5</strong>: I'm a mac daddy.</p>
<p>Nowadays, they are very, very healthy. Long live the cowboys.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.halcode.com/archives/2008/04/20/programmers-from-the-wild-west/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Debugging &#8220;hello, world&#8221;</title>
		<link>http://www.halcode.com/archives/2008/04/17/debugging-hello-world/</link>
		<comments>http://www.halcode.com/archives/2008/04/17/debugging-hello-world/#comments</comments>
		<pubDate>Wed, 16 Apr 2008 21:17:05 +0000</pubDate>
		<dc:creator>Jose</dc:creator>
		
		<category><![CDATA[Assembler]]></category>

		<category><![CDATA[Debug]]></category>

		<category><![CDATA[History]]></category>

		<category><![CDATA[Programming]]></category>

		<category><![CDATA[Retro]]></category>

		<category><![CDATA[assembly]]></category>

		<category><![CDATA[computing]]></category>

		<category><![CDATA[hello world]]></category>

		<guid isPermaLink="false">http://www.halcode.com/archives/2008/04/17/debugging-hello-world/</guid>
		<description><![CDATA[The Go command (g) will run the program starting at the given address (in this case, CS:0100) If everything goes right, the program should output the intended "hello, world!" string, and finish with the message "Program terminated normally."]]></description>
			<content:encoded><![CDATA[<p>Yesterday, we took a break after long hours of intensive coding, and a coworker started drawing parallels between our current frantic coding and the (fortunately) bygone days of college homework. I specifically recalled a project I had to build with <a href="http://en.wikipedia.org/wiki/DEBUG_%28DOS_Command%29" title="Microsoft DEBUG" onclick="javascript:pageTracker._trackPageview ('/outbound/en.wikipedia.org');">MS-DEBUG</a>: a simple calculator in assembly, which also required the hassle of dealing with pretty and safe user input. I have no intention of digging up those listings, but I thought about revisiting, for a moment, my dear old friend MS-DEBUG <img src='http://www.halcode.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> I'll build on the <a href="http://www.halcode.com/archives/2008/01/21/hello-world/" title="hello world reminiscences">first post of this blog</a>, and try to write a little 'hello, world' program in MS-DEBUG. I know this has little value beyond personal sentiment and a touch of nostalgia.</p>
<p>I remember that <code>a 100</code> tells DEBUG to assemble code into memory starting at <code>CS:0100</code>. Ok. </p>
<p>Now comes the data, which here simply consists of the string <code>'hello, world!'</code>. A neat output, however, would require <em>newlines</em> before and after our intended string. A newline consists of a carriage return (CR = ASCII 13) and a line feed (LF = ASCII 10) on the display. As DEBUG only understands hexadecimal numbers, we must use 0Dh and 0Ah for CR and LF, respectively. (Except in code, we will represent hexadecimal numbers by following the value with 'h'.) Fortunately, DEBUG also accepts ASCII characters directly (and the pseudo-instructions DB and DW!), so we can express our complete string as</p>
<p align="center"><code>db 0d,0a,"hello, world!",0d,0a,"$"</code></p>
<p><span id="more-12"></span><br />
The "$" requires an explanation. In order to print our string to STDOUT, we are going to resort to the <a href="http://spike.scu.edu.au/~barry/interrupts.html#ah09" title="Function 09 of INT 21h" onclick="javascript:pageTracker._trackPageview ('/outbound/spike.scu.edu.au');">function 09h of INT 21h</a>. This function prints all the characters in memory, beginning at the address held in DX (strictly, DS:DX) and finishing when the "$" (24h) sign is encountered (i.e., "$" acts as the zero in C strings). Finally, we invoke function 00h of INT 21h to terminate the program after the string is echoed. And that's it.</p>
<pre class="asm">- a <span style="color: #ff0000;">100</span>
<span style="color: #46aa03; font-weight:bold;">CS</span>:<span style="color: #ff0000;">0100</span> <span style="color: #00007f;">jmp</span> <span style="color: #ff0000;">114</span>         <span style="color: #adadad; font-style: italic;">; Jump over the 18 bytes of the string</span>
<span style="color: #46aa03; font-weight:bold;">CS</span>:<span style="color: #ff0000;">0102</span> <span style="color: #0000ff;">db</span> 0d,0a,<span style="color: #7f007f;">&quot;hello, world!&quot;</span>,0d,0a,<span style="color: #7f007f;">&quot;$&quot;</span>
<span style="color: #46aa03; font-weight:bold;">CS</span>:<span style="color: #ff0000;">0114</span> <span style="color: #00007f;">mov</span> <span style="color: #46aa03; font-weight:bold;">ah</span>,<span style="color: #ff0000;">9</span>       <span style="color: #adadad; font-style: italic;">; Print function</span>
<span style="color: #46aa03; font-weight:bold;">CS</span>:<span style="color: #ff0000;">0116</span> <span style="color: #00007f;">mov</span> <span style="color: #46aa03; font-weight:bold;">dx</span>,<span style="color: #ff0000;">102</span>
<span style="color: #46aa03; font-weight:bold;">CS</span>:<span style="color: #ff0000;">0119</span> <span style="color: #00007f;">int</span> <span style="color: #ff0000;">21</span>
<span style="color: #46aa03; font-weight:bold;">CS</span>:<span style="color: #ff0000;">011B</span> <span style="color: #00007f;">mov</span> <span style="color: #46aa03; font-weight:bold;">ah</span>, <span style="color: #ff0000;">0</span>      <span style="color: #adadad; font-style: italic;">; Terminate the program</span>
<span style="color: #46aa03; font-weight:bold;">CS</span>:011D <span style="color: #00007f;">int</span> <span style="color: #ff0000;">21</span>
<span style="color: #46aa03; font-weight:bold;">CS</span>:011F
-g =<span style="color: #ff0000;">100</span></pre>
<p>The Go command (g) will run the program starting at the given address (in this case, CS:0100). If everything goes right, the program should output the intended "hello, world!" string, and finish with the message "Program terminated normally."</p>
<p>You may save this program to a folder on your hard drive (I'll use c:\tmp). Simply input:</p>
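<p>A note on the <code>jmp 114</code> at the top of the listing: DEBUG assembles it as a short jump, EB followed by a one-byte displacement counted from the end of the two-byte instruction. A minimal C sketch of that arithmetic follows; the helper name <code>short_jmp_disp</code> is made up for illustration, and address wrap-around within the segment is ignored:</p>
<pre class="c">#include &lt;stdint.h&gt;

/* Hypothetical helper: computes the 8-bit displacement of a short JMP
   (opcode EB) placed at insn_addr and jumping to target.
   Returns 0 on success, -1 if the target is out of short-jump range. */
int short_jmp_disp(uint16_t insn_addr, uint16_t target, int8_t* disp)
{
  /* The displacement is relative to the byte after the 2-byte instruction. */
  int32_t d = (int32_t)target - ((int32_t)insn_addr + 2);
  if (d &lt; -128 || d &gt; 127)
    return -1;
  *disp = (int8_t)d;
  return 0;
}</pre>
<p>For the instruction at 0100 and the target 0114, the displacement is 0114 - 0102 = 12h, which matches the 18 bytes of the string.</p>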
<pre class="asm">-n <span style="color: #0000ff;">c</span>:\tmp\hello.com
-rcx
<span style="color: #46aa03; font-weight:bold;">CX</span> <span style="color: #ff0000;">0000</span>
:<span style="color: #ff0000;">20</span>
-w</pre>
<p><code>-rcx</code> lets you change the content of the CX register. We store there the value "20" (hexadecimal, that is, 32 in decimal), which covers our program: the code runs from CS:0100 to CS:011E, that is, 31 bytes (1Fh), so writing 20h bytes saves it along with one harmless trailing byte. Write (w) uses CX (strictly, BX:CX, with BX left at 0 here) to know how many bytes it has to save.</p>
<p>Now, you can dump memory contents (d) and view your program in hexadecimal (or open hello.com with a hexadecimal viewer):</p>
<pre class="asm">-d
<span style="color: #46aa03; font-weight:bold;">CS</span>:<span style="color: #ff0000;">0100</span> EB <span style="color: #ff0000;">12</span> 0D 0A <span style="color: #ff0000;">68</span> <span style="color: #ff0000;">65</span> 6C 6C 6F 2C <span style="color: #ff0000;">20</span> <span style="color: #ff0000;">77</span> 6F <span style="color: #ff0000;">72</span> 6C <span style="color: #ff0000;">64</span>
<span style="color: #46aa03; font-weight:bold;">CS</span>:<span style="color: #ff0000;">0110</span> <span style="color: #ff0000;">21</span> 0D 0A <span style="color: #ff0000;">24</span> B4 <span style="color: #ff0000;">09</span> BA <span style="color: #ff0000;">02</span> <span style="color: #ff0000;">01</span> CD <span style="color: #ff0000;">21</span> B4 <span style="color: #ff0000;">00</span> CD <span style="color: #ff0000;">21</span> 0D</pre>
<p>The hexadecimal patterns are very clear. For example, 68 65 6C 6C 6F 2C 20 77 6F 72 6C 64 21 is our "hello, world!" string. 0D and 0A are CR and LF, respectively. CD 21 is INT 21h. B4 09 is MOV AH,9. And so on.</p>
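<p>To make the reading of the dump concrete, here is a small C sketch that holds the same bytes in an array and measures the string the way function 09h scans it: starting at the offset in DX and stopping at "$". It is only an illustration under two assumptions: the image is loaded at offset 100h (so DX = 0102 is index 2 of the array), and the names <code>hello_com</code> and <code>dollar_string_length</code> are hypothetical. The array holds the bytes from CS:0100 to CS:011E; the last byte shown in the dump, at CS:011F, lies past the final INT 21h and is left out.</p>
<pre class="c">#include &lt;stddef.h&gt;

#define COM_LOAD_OFFSET 0x100u  /* assumed load offset of the image */

/* The program bytes as shown in the dump (CS:0100 to CS:011E). */
const unsigned char hello_com[] = {
  0xEB, 0x12,                               /* jmp 114            */
  0x0D, 0x0A,                               /* CR, LF             */
  0x68, 0x65, 0x6C, 0x6C, 0x6F, 0x2C, 0x20, /* "hello, "          */
  0x77, 0x6F, 0x72, 0x6C, 0x64, 0x21,       /* "world!"           */
  0x0D, 0x0A,                               /* CR, LF             */
  0x24,                                     /* "$" terminator     */
  0xB4, 0x09,                               /* mov ah,9           */
  0xBA, 0x02, 0x01,                         /* mov dx,102         */
  0xCD, 0x21,                               /* int 21             */
  0xB4, 0x00,                               /* mov ah,0           */
  0xCD, 0x21                                /* int 21             */
};

/* Counts the bytes from offset dx up to (not including) the "$" (24h).
   Returns (size_t)-1 if dx is below the load offset or no "$" is found. */
size_t dollar_string_length(const unsigned char* image, size_t image_len,
                            unsigned dx)
{
  if (dx &lt; COM_LOAD_OFFSET)
    return (size_t)-1;
  size_t start = dx - COM_LOAD_OFFSET;
  for (size_t i = start; i &lt; image_len; i++)
    if (image[i] == 0x24)
      return i - start;
  return (size_t)-1;
}</pre>
<p>With DX = 0102 the function should report 17 characters: CR, LF, the 13 characters of "hello, world!", and another CR, LF.</p>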
<p>Finally, remember that you may use Enter (e) to write your code directly into memory:</p>
<pre class="asm">-e <span style="color: #ff0000;">100</span> EB <span style="color: #ff0000;">12</span> 0D 0A <span style="color: #ff0000;">68</span> <span style="color: #ff0000;">65</span> 6C 6C 6F 2C <span style="color: #ff0000;">20</span> <span style="color: #ff0000;">77</span> 6F <span style="color: #ff0000;">72</span> 6C <span style="color: #ff0000;">64</span>
-e <span style="color: #ff0000;">110</span> <span style="color: #ff0000;">21</span> 0D 0A <span style="color: #ff0000;">24</span> B4 <span style="color: #ff0000;">09</span> BA <span style="color: #ff0000;">02</span> <span style="color: #ff0000;">01</span> CD <span style="color: #ff0000;">21</span> B4 <span style="color: #ff0000;">00</span> CD <span style="color: #ff0000;">21</span> 0D
-g =<span style="color: #ff0000;">100</span></pre>
<p>Pretty, uh? <img src='http://www.halcode.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /></p>
]]></content:encoded>
			<wfw:commentRss>http://www.halcode.com/archives/2008/04/17/debugging-hello-world/feed/</wfw:commentRss>
		</item>
		<item>
		<title>A Review of OpenGL Programming on Mac OS X</title>
		<link>http://www.halcode.com/archives/2008/04/12/a-review-of-opengl-programming-on-mac-os-x/</link>
		<comments>http://www.halcode.com/archives/2008/04/12/a-review-of-opengl-programming-on-mac-os-x/#comments</comments>
		<pubDate>Fri, 11 Apr 2008 23:27:42 +0000</pubDate>
		<dc:creator>Jose</dc:creator>
		
		<category><![CDATA[Mac]]></category>

		<category><![CDATA[OpenGL]]></category>

		<category><![CDATA[Programming]]></category>

		<category><![CDATA[apple]]></category>

		<category><![CDATA[games]]></category>

		<category><![CDATA[mac os x]]></category>

		<guid isPermaLink="false">http://www.halcode.com/archives/2008/04/12/a-review-of-opengl-programming-on-mac-os-x/</guid>
		<description><![CDATA[All of the explanations are crystal clear, focused on the concepts and techniques OpenGL developers <em>really</em> need. The book comprises OpenGL architecture and configuration on OS X, and the various APIs we can use in order to create OpenGL applications, specifically, CGL, AGL, Cocoa, (our old buddy) GLUT, and X11 APIs.]]></description>
			<content:encoded><![CDATA[<p style="text-align: center"><img src="http://www.halcode.com/images/oglmacosx.jpg" alt="OpenGL Programming on Mac OS X" width="380" height="498" /></p>
<p>The full title of this book is "<strong>OpenGL Programming on Mac OS X: Architecture, Performance and Integration</strong>." Its fortunate authors are Robert P. Kuehne and J. D. Sullivan, two professionals who thoroughly know what they are talking about. Moreover, the book has been published by one of my favorites, Addison-Wesley. Therefore, success in conveying the details of OpenGL Programming on the Apple platform seems guaranteed. After reading it, I can confirm that any graphics programmer will learn a lot from this book. And nowadays, with a market saturated by rushed books, that's a blessing.<br />
<span id="more-15"></span><br />
As suggested, this book has plenty of shining points, but there is a germane disclaimer: <em>this is not a book for starting to learn OpenGL</em>. As stated in the book's first pages, it's aimed at two categories of programmers:</p>
<ol>
<li>Mac developers in general,</li>
<li>and those with OpenGL foundations who want to explore the enormous benefits of OpenGL development on Mac OS X.</li>
</ol>
<p>I do strongly believe that any OpenGL developer will benefit from studying this great book. However, I don't know of any good book for <em>learning OpenGL from scratch</em>. I can point you to the Red Book or even the OpenGL SuperBible, which are superb references for novices, but both exhibit some important pedagogical flaws. In contrast, "OpenGL Programming on Mac OS X" is indeed a very good book from the pedagogical standpoint, but it has a somewhat advanced level, covering very specific topics:</p>
<ol>
<li>Mac OpenGL Introduction</li>
<li>OpenGL Architecture on OS X</li>
<li>Mac Hardware Architecture</li>
<li>Application Programming on OS X</li>
<li>OpenGL Configuration and Integration</li>
<li>The CGL API for OpenGL Configuration</li>
<li>The AGL API for OpenGL Configuration</li>
<li>The Cocoa API for OpenGL Configuration</li>
<li>The GLUT API for OpenGL Configuration</li>
<li>API Interoperability</li>
<li>Performance</li>
<li>Mac Platform Compatibility</li>
<li>OpenGL Extensions</li>
</ol>
<p>The book also contains 4 useful appendices:</p>
<ol>
<li>X11 APIs for OpenGL Configuration</li>
<li>Glossary</li>
<li>The Cocoa API for OpenGL Configuration in Leopard, Mac OS X 10.5</li>
<li>Bibliography</li>
</ol>
<p>More resources should be available on the <a href="http://www.macopenglbook.com" title="Mac OpenGL book" onclick="javascript:pageTracker._trackPageview ('/outbound/www.macopenglbook.com');">book's companion website</a>. At the time of this writing, you can download the source code from that page as a locked zipped file. By the way, the authors' decision to password-protect their sources is interesting (to say the least) <img src='http://www.halcode.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
<p>With respect to the content, all of the explanations are crystal clear, focused on the concepts and techniques OpenGL developers <em>really</em> need. The book comprises OpenGL architecture and configuration on OS X, and the various APIs we can use in order to create OpenGL applications, specifically, CGL, AGL, Cocoa, (our old buddy) GLUT, and X11 APIs. A chapter focused on API Interoperability is also included, and is very complete. But there is much more information in this book: history notes, a germane review of Mac's hardware, OS X programming, compatibility between Mac platforms, and a discussion about OpenGL extensions. The glossary in the appendices proves to be very useful, and so are the notes about Cocoa API for OpenGL in Leopard.</p>
<p>Personally, Chapter 11 is the one I've enjoyed the most. The technical wisdom revealed in that chapter almost justifies by itself the full cost of the book. It's such a fine chapter. The almost 5 pages covering the "Axioms for Designing High-Performance OpenGL Applications" are very interesting, particularly the care we must take when doing our OpenGL drawing in Object-Oriented programs; we could easily incur considerable glVertex overhead if our code is not properly structured. The little tutorial section "Putting It All Together" includes a detailed optimization of an OpenGL program, "Please Tune Me". It's just the kind of practical stuff that programmers love. Delicious. Highly recommended.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.halcode.com/archives/2008/04/12/a-review-of-opengl-programming-on-mac-os-x/feed/</wfw:commentRss>
		</item>
		<item>
		<title>hello, world</title>
		<link>http://www.halcode.com/archives/2008/01/21/hello-world/</link>
		<comments>http://www.halcode.com/archives/2008/01/21/hello-world/#comments</comments>
		<pubDate>Sun, 20 Jan 2008 20:30:46 +0000</pubDate>
		<dc:creator>Jose</dc:creator>
		
		<category><![CDATA[C]]></category>

		<category><![CDATA[History]]></category>

		<category><![CDATA[Languages]]></category>

		<category><![CDATA[Pascal]]></category>

		<category><![CDATA[Programming]]></category>

		<category><![CDATA[Retro]]></category>

		<category><![CDATA[B]]></category>

		<category><![CDATA[code]]></category>

		<guid isPermaLink="false">http://www.halcode.com/?p=1</guid>
		<description><![CDATA[Personally, by reading "hello, world", I evoke orange and warm afternoons, with my eyes strained (and soothed) by code. Nice, and overly inefficient Pascal code. In some images, a few BASIC snippets interleave, but those are not that nice to remember...]]></description>
			<content:encoded><![CDATA[<p>In calm thoughts, these two words (with the comma) bring to mind plenty of images. More often than not, I hold "hello, world" in fond remembrance. For this post I've slightly modified the default WordPress post title, in favor of <a href="http://en.wikipedia.org/wiki/Kernighan" title="Brian Kernighan" onclick="javascript:pageTracker._trackPageview ('/outbound/en.wikipedia.org');">Kernighan</a>'s original form: no capitalization, and with the comma. Through the years, it seems to me that this sequence lightens my worries when coping with new languages, systems, things. Somehow, the mind has understood that once "hello, world" is done, then reaching the entire system is achievable. Kind of Pavlovian Conditioning, I guess.</p>
<p>In K&amp;R’s C Tutorial, this <em>feel at ease</em> perception is also intended:</p>
<blockquote><p>The only way to learn a new programming language is by writing programs in it. The first program to write is the same for all languages: Print the words <em>hello, world</em>. This is the basic hurdle; to leap over it you have to be able to create the program text somewhere, compile it successfully, load it, run it, and find out where your output went.</p></blockquote>
<p>This way, "hello, world" should be our first step for pummeling through the new beast (language).<br />
<span id="more-1"></span><br />
<strong>Origins and B detour<br />
</strong></p>
<p>With respect to its origins, the <a href="http://cm.bell-labs.com/cm/cs/who/dmr/btut.html" title="Tutorial Programming Language B" onclick="javascript:pageTracker._trackPageview ('/outbound/cm.bell-labs.com');">Tutorial Introduction to the Language B</a>, by Kernighan, contains the primordial use of our salutation, under the section "External Variables":</p>
<pre class="c">main<span style="color: #66cc66;">&#40;</span> <span style="color: #66cc66;">&#41;</span> <span style="color: #66cc66;">&#123;</span>
extrn a, b, c;
putchar<span style="color: #66cc66;">&#40;</span>a<span style="color: #66cc66;">&#41;</span>; putchar<span style="color: #66cc66;">&#40;</span>b<span style="color: #66cc66;">&#41;</span>; putchar<span style="color: #66cc66;">&#40;</span>c<span style="color: #66cc66;">&#41;</span>; putchar<span style="color: #66cc66;">&#40;</span><span style="color: #ff0000;">'!*n'</span><span style="color: #66cc66;">&#41;</span>;
<span style="color: #66cc66;">&#125;</span>
&nbsp;
a <span style="color: #ff0000;">'hell'</span>;
b <span style="color: #ff0000;">'o, w'</span>;
c <span style="color: #ff0000;">'orld'</span>;</pre>
<p>I recognize a lot of familiar things going on here in B:</p>
<ol>
<li>First, our beloved <code>main</code> is already incarnated.</li>
<li>Parentheses, brackets, semicolons and the syntax for passing arguments are right in their position.</li>
<li>And <code>extrn</code> -&gt; <code>extern</code>.</li>
<li>Clearly, that <code>'*n'</code> resembles <code>'\n'</code>: notice that the literal passed to <code>putchar</code> is <code>'!*n'</code> (comprised of two characters) and thereby the exact, original output is "hello, world!", with the "!".</li>
<li>Interestingly, B is typeless and not suited for numeric computation, so no <code>int</code>, no <code>float</code>.</li>
<li>Besides, <code>a</code>, <code>b</code> and <code>c</code> are global variables (static storage), which means they can be initialized. But for a B function to be able to access such global variables, an <code>extrn</code> declaration is mandatory... delightful. Now, although the language does not include a mechanism for explicit types, the three variables in the example are initialized to character constants, which in B are single-quoted and can have from one to four ASCII characters (in C a character constant is formed by enclosing a <em>single character</em> from the representable character set within single quotation marks). By the way, the upper bound of four characters is a hardware restriction, as <a href="http://en.wikipedia.org/wiki/Stephen_C._Johnson" title="Stephen Johnson" onclick="javascript:pageTracker._trackPageview ('/outbound/en.wikipedia.org');">Stephen Johnson</a> points out in the <a href="http://cm.bell-labs.com/cm/cs/who/dmr/bref.html" title="Users' Reference to B on MH-TSS" onclick="javascript:pageTracker._trackPageview ('/outbound/cm.bell-labs.com');">Users' Reference to B on MH-TSS</a>:</li>
</ol>
<blockquote><p>A character constant is represented by <code>'</code> followed by one or more characters (possibly escaped) followed by another <code>'</code>. It has an rvalue equal to the value of the characters packed and right adjusted, with zero fill.  Obviously, the number of characters in a character constant is a machine dependent quantity; on the H6070,  up to four characters are allowed.</p></blockquote>
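<p>A minimal sketch in C of the packing Johnson describes, under assumptions of mine: nine bits per character (so that four characters fill a 36-bit H6070 word), the first character ending up in the most significant position, and a hypothetical function name <code>b_pack_char_constant</code>:</p>
<pre class="c">#include &lt;stddef.h&gt;
#include &lt;stdint.h&gt;
#include &lt;string.h&gt;

#define B_CHAR_BITS      9u  /* assumed character width on the H6070 */
#define B_CHARS_PER_WORD 4u  /* upper bound quoted in the reference  */

/* Hypothetical helper: packs the characters of a B character constant
   into one word, right adjusted with zero fill.
   Returns 0 on success, -1 if there are no characters or more than four. */
int b_pack_char_constant(const char* chars, uint64_t* word)
{
  size_t n = strlen(chars);
  if (n == 0 || n &gt; B_CHARS_PER_WORD)
    return -1;
  uint64_t w = 0;  /* unused high-order positions stay zero */
  for (size_t i = 0; i &lt; n; i++)
    w = (w &lt;&lt; B_CHAR_BITS) | (unsigned char)chars[i];
  *word = w;
  return 0;
}</pre>
<p>Under these assumptions <code>'hell'</code> fits exactly in one word, while a five-character constant such as <code>'hello'</code> is rejected, which is why the original program needs three variables.</p>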
<p>B looks a lot like C, and had its roots in an older language called <a href="http://en.wikipedia.org/wiki/BCPL" title="BCPL" onclick="javascript:pageTracker._trackPageview ('/outbound/en.wikipedia.org');">BCPL</a>. One of its authors, <a href="http://en.wikipedia.org/wiki/Ken_Thompson_%28programmer%29" title="Ken Thompson" onclick="javascript:pageTracker._trackPageview ('/outbound/en.wikipedia.org');">Ken Thompson</a>, when asked if B was a subset of BCPL, <a href="http://hopl.murdoch.edu.au/showlanguage2.prx?exp=492" title="Thompson interview" onclick="javascript:pageTracker._trackPageview ('/outbound/hopl.murdoch.edu.au');">answered this</a>:</p>
<blockquote><p>It wasn't a subset. It was almost exactly the same. It was a interpreter instead of a compiler. It had two passes. One went into intermediate language and which one was the interpreter of the intermediate language. Dennis wrote a compiler for B, that worked out of the intermediate language. It was very portable and in less than a day you could get very versatile (not clear). Typically the interpreter was a set macros for your interpreter, they were very field orientated and you just define these macros with these fields and then write a little interpreter that would switch the set routines, and you had to write about twenty three-line routines, and it would run. And it was very small, very clean. It was the same language as BCPL, it looked completely different, syntactically it was, you know, a redo. The semantics was exactly the same as BCPL. And in fact the syntax of it was, if you looked at, you didn't look too close, you would say it was C. Because in fact it was C, without types. There's no word like interchar or struct or anything like that. The word for... There was a word for extern, which means to declare an external thing. There was a word auto, which declared an auto thing.</p></blockquote>
<p>"Dennis" of course is the other author (and C's creator), <a href="http://en.wikipedia.org/wiki/Dennis_Ritchie" title="Dennis Ritchie" onclick="javascript:pageTracker._trackPageview ('/outbound/en.wikipedia.org');">Dennis Ritchie</a>. Chapter 1 of The C Programming Language, by Kernighan and Ritchie, includes the clearer version:</p>
<pre class="c">main<span style="color: #66cc66;">&#40;</span><span style="color: #66cc66;">&#41;</span>
<span style="color: #66cc66;">&#123;</span>
<a href="http://www.opengroup.org/onlinepubs/009695399/functions/printf.html"><span style="color: #000066;">printf</span></a><span style="color: #66cc66;">&#40;</span><span style="color: #ff0000;">&quot;hello, world<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><span style="color: #66cc66;">&#41;</span>;
<span style="color: #66cc66;">&#125;</span></pre>
<p>Now the compiler does more things for you. Good.</p>
<p><strong>Reminiscences</strong></p>
<p>Personally, by reading "hello, world", I evoke orange and warm afternoons, with my eyes strained (and soothed) by code. Nice, and overly inefficient Pascal code. In some images, a few BASIC snippets interleave, but those are not that nice to remember <img src='http://www.halcode.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> That was not too long ago. I had to complete project after project in Pascal for some undergraduate courses, which amounted to lots of tangled <em>pasta</em>. Lovely functions and procedures which required scrolling several pages in order to reach the closing <code>end</code>. Today, I wonder how I was able to bear all those insanely long units. Miracles of youth. I do remember having one of these procedures for drawing a primitive text-mode GUI, based on ALT + 196 and ALT + 179: <code>Write('─')</code> and<code> Write('│')</code>. Thanks, Turbo Pascal, for so many blessings. That procedure was flooded with this:</p>
<pre class="c"><span style="color: #66cc66;">&#123;</span> Draw a line <span style="color: #66cc66;">&#125;</span>
<span style="color: #b1b100;">For</span> Loop := <span style="color: #cc66cc;">1</span> To <span style="color: #cc66cc;">80</span> <span style="color: #b1b100;">Do</span>
Write<span style="color: #66cc66;">&#40;</span><span style="color: #ff0000;">'─'</span><span style="color: #66cc66;">&#41;</span>;</pre>
<p>You've gotta love the obnoxious formatting (including the capitalization of <code>To</code> and <code>Do</code>), the dumb comment, the shining constants, and the good programming practice that such <code>For</code>-flooded procedures represent. And, for clarity, I have omitted that this <code>For</code> was inside a deeply nested <code>If</code>. Beautiful. Undoubtedly, spaghetti code was my early approach to cryptography. That code, by the way, reminds me of the <a href="http://www.gnu.org/fun/jokes/helloworld.html" title="hello world jokes" onclick="javascript:pageTracker._trackPageview ('/outbound/www.gnu.org');">First year in college</a> style from the GNU "hello, world" joke.</p>
<p>I used to think that Pascal was a cool language (I still do). But another piece by Kernighan, again and perhaps unintentionally, made it clear to me that the distance between boys and men was greater than I had thought. Dated April 2, 1981 (incidentally, a few days before I was born), <a href="http://www.lysator.liu.se/c/bwk-on-pascal.html" title="Why Pascal is not my favorite language" onclick="javascript:pageTracker._trackPageview ('/outbound/www.lysator.liu.se');">Why Pascal is Not My Favorite Programming Language</a> shed new light on how far behind I was. From then on, Pascal lost some of its charm, and some of my spare time. But I would not say it's not my favorite language... I'd prefer to say that it's one of my favorite languages, with C being the most liked (so far). The inconsistencies of affection...</p>
<p>A little homage to a great buddy, and first post. "hello, world".</p>
]]></content:encoded>
			<wfw:commentRss>http://www.halcode.com/archives/2008/01/21/hello-world/feed/</wfw:commentRss>
		</item>
	</channel>
</rss>
