
In Web We Trust

September 18th, 2008

Spíder's Web

The BBC has an interesting news report, Warning sounded on web's future, on the worries of Tim Berners-Lee about the spreading of disinformation using the web. The article also touches on other related topics such as extending the reach of the web and improving the web's usability.

Regarding the problem of publishing insincere or unfounded information on the web, Berners-Lee cites a recent campaign sparked by the activation of the Large Hadron Collider. Specifically, some groups used the web to spread unfounded beliefs about the world being destroyed by the LHC. This use of the web is, in the first place, unethical. Further, it may spread fear, and the information posted by such groups has no scientific grounds. This is a typical example of misusing a system.

This kind of problem arises because of the web's intrinsic nature. The web is a different system... a system which is now inevitably linked to the behavior and trends of human society. The web is an open and complex system. It's open because we can add as much information as we want, and it's complex because it's composed of a huge number of producers and consumers of information, and their interactions. There is another factor that shapes the web's complexity: evolution.

The web changes at a very high rate. The web's technologies and content change so quickly that it's impossible for any person to assimilate all this evolution. After all, the user only wants to open her browser and read the news. Typically, she is interested neither in the technical details behind the publishing platform, nor in the path followed by the information until it reaches its final form in the online media. There is a tool the user resorts to in order to fight complexity: trust.

Published in Computer Science  |  1 Comment


Computer Science Questions 2

September 14th, 2008

Again, I'll be answering some questions I've received in my inbox, just like in the previous post of this series.

  1. I want to know how to declare a const variable in one file and access it from other files? (C++) by Ricardo Orozco (Venezuela).

    It's a fairly basic question, and reveals that you have to study more C++. What you want is to define a const variable at global scope. Unlike non-const variables (which are extern by default), const variables are local to the file in which they are defined. Therefore, you cannot access them from other files unless you specify that the variable is extern.

    For instance, if you specify extern when defining the variable bufferSize in file1.cc

    extern const int bufferSize = 512;

    you can access bufferSize from any other file, say, file2.cc

    extern const int bufferSize; // we are using bufferSize from file1.cc

  2. I'm just starting to study speech synthesis, and I'm faced with the calculation of resonances in a uniform acoustic tube. The glottal end is closed, but the mouth is open. Could you expand on this? by Bakuman.

    This is the configuration you are studying:

    The acoustic tube is uniform, and its length is L. The glottis, located at x=-L, is closed (infinite impedance) and the mouth, located at x=0, is open (impedance zero). Now, pressure variation p(x) along this uniform acoustic tube is expressed as:

    \frac{d^2p}{dx^2} + \left(\frac{2\pi f}{c}\right)^2p = 0 ~~(I)

    where f represents frequency in Hz, and c is the speed of sound: 3.53 \times 10^4 cm/s at 37° C.

    According to the boundary conditions (the impedances at both ends), the solution is:

    p(x) = P_m \sin{\frac{2\pi f}{c}x} ~~(II)

    where P_m is the peak in sound pressure. On the other hand, we have a relation between pressure and volume velocity

    \frac{dp}{dx} = -\frac{j2\pi f \rho}{A}U ~~(III)

    A is a constant representing the tube's area. Now, volume velocity can be expressed as

    U(x) = jP_m \frac{A}{\rho c} \cos{\frac{2\pi f}{c}x} ~~(IV)

    where \rho equals the average atmospheric density (1.14 \times 10^{-3} g/cm^3 at 37°C).

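    As U(−L) = 0 (the glottis is closed), the cosine in (IV) must vanish at x = −L, i.e., \frac{2\pi f}{c}L = (2n - 1)\frac{\pi}{2}. Therefore, the resonances F_n of the acoustic tube are

    F_n = \frac{2n - 1}{4}\frac{c}{L} ~~(V)

    where n=1, 2, 3... And that's it. We can see that the tube's area A does not affect the location of the resonances. Finally, remember that, on average, the male oral tract has a length of 16.9 cm, and the female tract has an average length of 14.1 cm. For instance, with L = 16.9 cm, (V) gives F_1 ≈ 522 Hz, F_2 ≈ 1567 Hz and F_3 ≈ 2611 Hz.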

  3. What's a stub? by Dani Hoffmann (Colombia).

    It really, really depends on the context. Your question is too ambiguous. A Stub may even be a relative of the Danish poet Ambrosius Stub. After all, code is poetry.

    In computing, I know of 4 contexts where the word stub has a well-established meaning:

    1. Web Sites: A stub is a web page in progress, i.e., a page which provides minimal information and is intended for later development. For instance, a Wikipedia stub is a short article in need of expansion.
    2. Coding: During development, we sometimes use a "skeleton" function (or procedure, or method) to simulate some intended (but not yet implemented) functionality. For instance, the function may stand in for a complex algorithm to be developed later, or simulate a procedure running on a remote host. Such a placeholder function is called a stub function. Stub functions come in handy for quick prototyping and testing.
    3. Distributed Systems: In distributed systems, a service interface defines the services available to programs. These services are distributed among several networked machines. A program on machine A may request a service by calling a procedure. However, the procedure may be offered by a remote host, say, machine B. Remote Procedure Calls (RPC) are a paradigm of distributed systems aimed at abstracting the communication between hosts in a network. The goal of RPCs is to hide the details of the remote call. The remote call should look like a local one, i.e., the program on machine A would invoke the procedure on machine B as it would invoke a procedure locally. Under the hood, though, we obviously have to transmit information from the client (caller) to the server (callee), and in the other direction. Now, how do we hide the fact that we are calling a procedure located on another machine? This is the basic idea of RPCs:
      • In the address space of the client, we represent the server procedure by means of a local procedure called the client stub. Likewise, the server is linked to a server stub, which will receive the message from the client stub.
      • When machine A requests a service which is provided by machine B, a call is made to the client stub (which has the same name as the procedure in B). As the client stub lies in the same address space as the caller, the invocation is handled locally, and the program sees this invocation as a local one. However, the client stub marshals the received parameters and sends them, through the network, to the server stub. In turn, the server stub unmarshals the parameters and performs the call to the real procedure in the server. When the server procedure finishes, results or exception data travel back, from server to client. By the way, marshalling is the process of taking a collection of data items (such as the procedure name and its arguments) and grouping them according to some predefined representation, suitable for transmission over the network. The server should know and conform to this representation in order to unmarshal the received data and recover the transmitted information.

      Although RPCs are conceptually simple, implementing them poses some interesting (nasty) problems, such as passing pointer arguments (remember that client and server have different address spaces).

    4. Computer Networking: A stub network is a network or part of a network with only one communication path to external networks (non-local hosts). For instance, if we connect to our Internet Service Provider using only one router, our local network is a stub network with respect to our provider.

    There is another related context for stubs: in electronics, we identify stub sections, which are mostly used for impedance matching in transmission lines. But I'm not too familiar with this meaning of "stub".
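To make the coding sense of the word more concrete, here is a minimal sketch of a stub function in C++. Everything in it is made up for illustration: estimatePitchHz() plays the role of a complex algorithm that has not been written yet, describeFrame() is the code that depends on it, and the 165 Hz threshold is an arbitrary value.

```cpp
#include <string>
#include <vector>

// Hypothetical stub: stands in for a pitch-estimation algorithm that does not
// exist yet. It ignores its input and always returns the same canned value.
double estimatePitchHz(const std::vector<double>& /*samples*/) {
    return 120.0;  // placeholder result; the real algorithm would go here
}

// Code that depends on the unfinished routine can already be written and tested.
// The 165 Hz threshold is an arbitrary value chosen for this sketch.
std::string describeFrame(const std::vector<double>& samples) {
    const double pitchHz = estimatePitchHz(samples);
    return pitchHz < 165.0 ? "low-pitched frame" : "high-pitched frame";
}
```

Calling describeFrame() on any frame yields "low-pitched frame" for now, because the stub always answers 120 Hz. Once the real algorithm replaces the body of estimatePitchHz(), the callers do not need to change.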

As always, it was a pleasure to answer your questions. Thank you very much.



A Central Abstraction: The Process (I)

August 30th, 2008

Abstractions

I strongly believe that abstraction is the root of computing (however, you may want to read Is abstraction the key to computing? as a motivation for a different perspective on the role of abstraction in computing). Modern hardware and software systems include so many features and perform so many tasks that it is impossible to understand, build and use them without resorting to abstractions. For instance, let's take a look at the CPU: it is the central part of a general purpose computing system, and is also an extremely complex system in itself. Functionally, a CPU is an instruction-crunching device: it processes one instruction after another, following the steps of fetch, decode, execute and writeback (in von Neumann architectures). In other words, the CPU retrieves the instruction from memory, decodes it, executes it, and puts the results of the operation back into memory. Further, the CPU has no clue (and actually does not care) about the higher-level semantics of the instruction it may be executing at a specific time. For example, the CPU may be executing an instruction related to a spell-checking task, and a few instructions later it may be executing an instruction related to another task, say, MP3 playing. It only follows orders, and just executes the instruction it is told to execute.
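The fetch, decode, execute and writeback steps can be sketched with a toy interpreter. This is only an illustration under loud assumptions: the four-instruction machine (Load, Add, Store, Halt), the single accumulator and the names Op, Instruction and run are all invented here, and no real CPU is this simple.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical four-instruction machine, invented for this sketch.
enum class Op { Load, Add, Store, Halt };

struct Instruction {
    Op op;
    std::size_t addr;  // index into data memory (ignored by Halt)
};

// Executes `program` on `memory`; returns how many instructions were executed.
std::size_t run(const std::vector<Instruction>& program, std::vector<int>& memory) {
    int acc = 0;               // accumulator register
    std::size_t pc = 0;        // program counter
    std::size_t executed = 0;
    while (pc < program.size()) {
        const Instruction inst = program[pc++];  // fetch, then advance the PC
        ++executed;
        if (inst.op == Op::Halt) {               // decode: Halt stops the machine
            break;
        }
        assert(inst.addr < memory.size());       // operand must be a valid address
        switch (inst.op) {                       // decode, then execute
            case Op::Load:  acc = memory[inst.addr];  break;
            case Op::Add:   acc += memory[inst.addr]; break;
            case Op::Store: memory[inst.addr] = acc;  break;  // write back
            case Op::Halt:  break;                            // handled above
        }
    }
    return executed;
}
```

Note that run() knows nothing about what the program is for: whether the numbers belong to a spell-checker or to a media player is invisible at this level, which is exactly the point made above.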

Nowadays, computing systems are expected to do more tasks on behalf of their users. Several tasks must be performed concurrently. As in the previous example, the system might be running the spell-checker and the media player simultaneously. In multiprogrammed systems we can achieve pseudoparallelism by switching (multiplexing) the CPU among all the user's activities (true parallelism is only possible in multi-processor or multi-core systems). Remember that multiprogramming requires the CPU to be allocated to each of the system's tasks for a period of time and deallocated when some condition is met.



Retrieving system time: gettimeofday()

August 26th, 2008

Today, a friend of mine reported a problem with gettimeofday() under MinGW. It was a relatively common error: 'gettimeofday' undeclared (first use this function). The cause and solution of this problem are fairly simple, and we'll present them at the end of the post. But first, what is this gettimeofday() function?

gettimeofday() is a function for retrieving system time in POSIX-compliant systems. Unlike the time() function, which has a resolution of 1 second, gettimeofday() has a higher resolution: microseconds. Specifically, the prototype of gettimeofday() is:

int gettimeofday (struct timeval *tp, struct timezone *tzp)

The function retrieves the current time expressed as seconds and microseconds since the Epoch, and stores it in the timeval structure pointed to by tp. The struct timeval has the following members:

long int tv_sec: Number of whole seconds of elapsed time.
long int tv_usec: The rest of the elapsed time (a fraction of a second), represented as the number of microseconds.

Thanks to the tv_usec member, we have a resolution of microseconds. It's also important to remember what the Epoch is. The Epoch is just an arbitrary starting date set by the system in order to compute time, i.e., it's a reference or base time. For instance, POSIX-compliant systems measure system time as the number of seconds elapsed since the start of the epoch at 1970-01-01 00:00:00 Z.

For its part, the struct timezone was used to return information about the time zone. However, using this parameter is obsolete (e.g., it has not been and will not be supported by libc or glibc). Therefore, tzp should be a null pointer; otherwise the behavior may be unspecified (check your system's specifications).
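As a minimal sketch of how the call is typically used, assuming a POSIX system where sys/time.h declares gettimeofday(), the following helper reads the clock and folds tv_sec and tv_usec into a single count of microseconds. The name nowMicros and the choice of returning -1 on error are mine, not part of any API.

```cpp
#include <sys/time.h>  // gettimeofday(), struct timeval (POSIX)

#include <cstdint>

// Current time as microseconds since the Epoch, or -1 if gettimeofday() fails.
// nowMicros is a hypothetical helper name used only in this sketch.
std::int64_t nowMicros() {
    struct timeval tv;
    if (gettimeofday(&tv, nullptr) != 0) {  // tzp is passed as a null pointer
        return -1;
    }
    // Whole seconds scaled to microseconds, plus the sub-second remainder.
    return static_cast<std::int64_t>(tv.tv_sec) * 1000000 + tv.tv_usec;
}
```

Taking two readings with nowMicros() and subtracting them gives an elapsed time in microseconds. Keep in mind that this is wall-clock time, so the difference can be distorted if the system clock is adjusted between the two readings.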

gettimeofday() returns 0 on success, or -1 on failure. Simple. Further, this function should be declared in sys/time.h. But my friend's installation of MinGW only included the following in sys/time.h:

 
#include <time.h>
 
#ifndef _TIMEVAL_DEFINED /* also in winsock[2].h */
#define _TIMEVAL_DEFINED
struct timeval {
  long tv_sec;
  long tv_usec;
};
#define timerisset(tvp)	 ((tvp)->tv_sec || (tvp)->tv_usec)
#define timercmp(tvp, uvp, cmp) \
	(((tvp)->tv_sec != (uvp)->tv_sec) ? \
	((tvp)->tv_sec cmp (uvp)->tv_sec) : \
	((tvp)->tv_usec cmp (uvp)->tv_usec))
#define timerclear(tvp)	 (tvp)->tv_sec = (tvp)->tv_usec = 0
#endif /* _TIMEVAL_DEFINED */
 




Computer Science Questions

August 23rd, 2008

I have been receiving several questions in my email, and I've been doing my best to answer them (I cannot afford to neglect my job... yet :-) ). I think that it would be more interesting to answer a few of these questions publicly: I'm no expert, and if I answer in public, others can point out my errors and, ultimately, we all learn. Therefore, this post inaugurates a series to answer some computer science-related questions from the readers of this blog. You may send your questions to my email (jose at halcode.com). Of course, I can only answer those questions within my reach. Let's start with the following 3 questions, which I received several days ago:

  1. Hi, Jose. Please, could you expand on the reasons why adding people to a late software project makes it later? by José Rodríguez (Venezuela).

    That statement is known as Brooks's Law, and it was coined by the renowned computer scientist and software engineer Frederick P. Brooks. Concretely, the original statement found in his 1975 classic The Mythical Man-Month is "adding manpower to a late software project makes it later". Basically, the idea is that adding more analysts, designers or programmers to a project running behind the original schedule will delay it even more.

    Broadly speaking, the rationale of Brooks's law is related to knowledge management. First, when new people are added to the project, some resources have to be diverted into training or informing the newcomers about the project's status, vision and philosophy. That will delay the project. Further, when the number of people participating in a project increases, so does the number of communication paths. Thereby, more resources (including time) are required in order to distribute the information. Regarding this point, you may be interested in reading my entry on "Knowledge Sharing" in Software Design, Trials and Errors.

  2. What is a Reentrant Routine? by Ricardo Orozco (Venezuela)

    A routine or procedure P is reentrant (or pure code) if it can be "re-entered" after it is already in execution. Basically, it means that P can be executed two or more times simultaneously, or alternatively, that P can be safely executed concurrently. There are some conditions P must follow in order to be reentrant, and you may check them in the Wikipedia entry for reentrant functions.

    Some programs necessarily have to be reentrant. For instance, device drivers. A device driver has to be reentrant because another interrupt may be raised while the driver is running. Reentrancy also allows for code sharing. For example, if a program consists of 600 KB of code and 200 KB of data, and n users are simultaneously using the program, we would require n x 600 KB of physical memory for the code if the program is not reentrant. But if the code is reentrant we can share a single copy among the n users, saving a lot of memory.

  3. How to characterize nasal consonants acoustically? by Rajeev Ranjan (India)

    A good answer to this question would require plenty of explanations. I suggest you check pages 487-514 of Acoustic Phonetics by Professor Kenneth Stevens. But I'll provide you with some hints, anyway.

    Nasal consonants are sonorant phonemes, but they exhibit significant losses due to the nasal tract coupling. Further, nasal spectra are relatively stable during the oral tract closure (there are minimal acoustic alterations). Typically, F1 is located near 250 Hz, F2 is weak, and F3 is near 2 kHz. Remember that for these phonemes the acoustic energy also travels through the nasal cavities. Such nasal cavities have different frequency properties. But the oral tract, albeit closed, also alters the acoustic transfer function. This transfer function, for simple phonemes such as vowels, includes only poles. However, when the oral tract is closed, the acoustic transfer function also includes zeros. And that changes the output a great deal. The location of the first spectral zero of nasal consonants depends on the point of oral closure (for instance, the point of closure for /m/ is more anterior than /n/'s).
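Coming back to the reentrancy question above, the difference is easier to see in code. The following is a minimal sketch with made-up function names: the first routine keeps its state in a static variable shared by every activation, while the second one works only on state supplied by its caller.

```cpp
// Non-reentrant: the running total lives in a static variable, so every
// activation of the routine (another thread, an interrupt handler) shares
// and can corrupt the same state. Hypothetical name, for illustration only.
long accumulateShared(long value) {
    static long total = 0;
    total += value;
    return total;
}

// Reentrant: the routine touches only its arguments and locals. Each caller
// owns its own `total`, so concurrent activations do not interfere as long
// as they are given different state objects.
long accumulate(long& total, long value) {
    total += value;
    return total;
}
```

Two users of accumulate() with separate totals never see each other's values, which is what makes it possible to share one copy of the code among them. With accumulateShared(), every caller silently contributes to the same total.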

Thank you very much for your questions, and for reading. I encourage you to send more questions to my email account. So long.

Published in Computer Science  |  4 Comments


Introduction to Digital Signal Processing (I)

August 21st, 2008

Digital Signal Processing (DSP) comprises the techniques and algorithms for transforming, filtering and representing digital signals (DSP is a subfield of the more general Signal Processing topic).

Continuous and Digital Signal Processing

A signal is a measurement of a process, an observation of the behavior of some system. Numerically, a signal is a time-varying or space-varying quantity (in the following, for simplicity, we'll assume that the independent variable is time t). Some physical signals, such as speech and images, are continuous in time. For instance, the speech signal is a continuously varying acoustic pressure wave. Sometimes, continuous signals x(t) are referred to as continuous or analog waveforms (continuous and analog are typically interchangeable terms albeit analog is a kind of absolute term, and we will not be using it in the following). Continuous signals take values at an uncountably infinite number of time instants. For their part, digital processing units can only handle sequences of numbers, i.e., they are discrete devices. In order to harness the benefits of digital processing units, continuous signals have to be first discretized (sampled). After sampling, we get a digital signal, which we might use as a representation of the original continuous signal. This sampling process is performed by a Continuous-to-Discrete (C/D) converter.

Summarizing, sampling is the process by which a digital representation of a continuous time signal is obtained. Basically, during sampling we select a finite number of data points (in a finite time interval) to represent the infinite amount of data that the continuous signal contains (within the same interval). If sampling is periodic, we sample x(t) at uniformly spaced time instants. Sampling is by no means a trivial issue, and we have to be careful in selecting the discrete data values... how well does this discrete sequence represent the continuous signal? (We'll study this question in an upcoming post).
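Periodic sampling can be sketched in a few lines of C++. In this illustration, which is only a sketch under stated assumptions, the continuous signal is taken to be a sinusoid, and the names x, sampleSignal, freqHz and sampleRateHz are my own choices. The sampling period is Ts = 1 / sampleRateHz, and the n-th sample is x(n * Ts).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Continuous-time signal x(t): a sinusoid of frequency freqHz
// (an illustrative choice; any function of t would do).
double x(double t, double freqHz) {
    const double pi = std::acos(-1.0);
    return std::sin(2.0 * pi * freqHz * t);
}

// Periodic sampling: returns the sequence x[n] = x(n * Ts), n = 0..count-1,
// where Ts = 1 / sampleRateHz is the sampling period in seconds.
std::vector<double> sampleSignal(double freqHz, double sampleRateHz, std::size_t count) {
    assert(sampleRateHz > 0.0);
    const double Ts = 1.0 / sampleRateHz;
    std::vector<double> samples(count);
    for (std::size_t n = 0; n < count; ++n) {
        samples[n] = x(static_cast<double>(n) * Ts, freqHz);
    }
    return samples;
}
```

For example, sampleSignal(1.0, 4.0, 5) picks five points of a 1 Hz sinusoid at t = 0, 0.25, 0.5, 0.75 and 1 s, i.e., approximately 0, 1, 0, -1 and 0. Whether such a sequence represents the original signal well depends on how the sampling rate relates to the signal.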




Software Design, Trials and Errors

August 7th, 2008

Recently, I read a succinct and instructive article by Professor Robert L. Glass, published in Communications of the ACM, Volume 51, Number 6 (2008). Professor Glass is a widely respected expert in the Software Engineering area, and his prose is always very eloquent and a pleasure to read. The specific article is Software Design and the Monkey's Brain, and it attempts to capture the nature of software design. By the way, if you enjoy that article, you may also like a recent book by Professor Glass: Software Creativity 2.0, in which he expands on the role of creativity in software engineering and computer programming in general.

Essentially, the article Software Design and the Monkey's Brain deals with two intertwined observations:

  1. Software Design is a sophisticated trial and error (iterative) activity.
  2. This iterative process mostly occurs inside the mind (at the speed of thought).

In the following, I'll present my own views on this topic. Regarding the first observation, I think that trial and error (I've also found the expression trial by error) is the underlying problem-solving approach of every software engineering methodology, like it or not. Alas, there is no algorithmic, perfectly formalized framework for creating software. In his classic book Object-Oriented Analysis and Design, Grady Booch says:

The amateur software engineer is always in search of magic, some sensational method or tool whose application promises to render software development trivial. It is the mark of the professional software engineer to know that no such panacea exists.

I totally agree. Nevertheless, some people dislike this reality. Referring to Software Engineering, a few (theorist) teachers of mine rejected calling it "Engineering". These people cannot live without "magic". Indeed, there are significant conceptual differences between software practitioners and some (stubborn) computer scientists with regard to the nature of Software Engineering. These scientists are not very fond of the trial and error approach. In his article, Professor Glass presents some past investigations which verified that designing software was a trial and error iterative process. He also reflects on the differences in professional perceptions:

This may not have been a terribly acceptable discovery to computer scientists who presumably had hoped for a more algorithmic or prescriptive approach to design, but to software designers in the trenches of practice, it rang a clear and credible bell.

I like to think of software construction as a synthesis process. Specifically, there are two general factors in tension: human factors and artificial factors. The former are mostly informal; the latter, mostly formal. From the conflict, software emerges. Let's remember that the synthesis solves the conflict between the parts by reconciling their commonalities, in order to form something new. It's the task of the software designer to reconcile the best of both worlds. Software designers have to evaluate different trade-offs between human and artificial factors.

As a problem-solving activity, software construction is solution-oriented: the ultimate goal of software is to provide a solution to some specific problem. Such a solution is evaluated by means of a model of the solution domain. But before arriving at such a solution domain model, we have to form the problem domain model. The problem domain model captures the aspects of reality that are relevant to the problem. Later, designers look for a solution, as noted, by trial and error. Additionally, the resources available to the designer, including knowledge, are limited. More often than not, empiricism and experience lead the search for a solution. This has an important consequence: software construction is a non-optimal process; we rarely arrive at the best solution (and which is the best solution?).

For its part, knowledge acquisition is another interesting process. During the entire development cycle, designers have access to incomplete knowledge. Gradually, designers learn those concepts pertinent to the problem domain model. And, when we are building the problem domain model, it often occurs that the client's perspective of the problem changes, and we have to adjust to the new requirements. Interestingly enough, knowledge acquisition is a nonlinear process. Sometimes, a new piece of information may invalidate all our designs, and we must be prepared to start over.


Published in Software Engineering  |  7 Comments


Hints at Speech Inverse Filtering of Fricative Phonemes

July 28th, 2008

Is it possible to invert fricatives by using Childers' Toolboxes?

At first glance, I think that the answer is that you can't. IIRC, Childers' toolbox allowed for inversion of the sentence "we were away a year ago". But that's a very convenient sentence to invert, because most of its relevant acoustic information can be clearly seen with a formant analysis. That's not the case for fricatives (and nasals, for instance, have other interesting problems too).

For my thesis, I developed my own inversion toolbox. But no matter the toolbox, you require a "source" of information for inversion. That information may be spectral energy distribution, formants, etc. For fricatives, formants are out of the question. The spectrum of fricatives differs significantly from that of voiced phonemes, as you know. When we utter fricatives, the oral tract naturally adopts a specific "constriction" configuration... and such a configuration would yield a formant structure. The problem is that the turbulence generated in the oral tract hides the resonances, and that's why formant tracking is misleading in such cases.


Published in Speech Technologies  |  No Comments


Resources for Articulatory Synthesis Research

June 10th, 2008

Recently I've been asked for some pointers to more resources about Articulatory Speech Synthesis. The articulatory approach is a very captivating research topic, but it's relatively hard, and is based on a hefty amount of multidisciplinary documents and results. Some of the relevant papers and books are somewhat old or difficult to find. This is my list of selected resources:

Books


Published in Speech Technologies  |  1 Comment


coLinux, int 80 on Windows and other rants

June 2nd, 2008

Recently, a reader sent me an email describing some problems he faced when trying to assemble, on Cygwin, code originally targeted at Linux. The problem, as he stated, was that int 0x80 didn't perform as expected. Well, plenty of explanations are pertinent:

Cygwin

Cygwin lets you run a collection of Unix tools on Windows, including the GNU development toolchain. At its core, however, Cygwin is a library which translates the POSIX system call API into the pertinent Win32 system calls (system calls are often abbreviated as syscalls). Therefore, Cygwin is a software layer between applications using POSIX system calls and the Win32 operating systems, which allows porting some Unix applications to Windows. This way you can, for instance, have the Apache daemon working as a Windows service. Another very attractive feature of Cygwin is its interactive environment: you can run your shell quite nicely, and run your Autoconf scripts, for example. However, porting means recompiling. There is no binary compatibility, and your program cannot run on computers without Cygwin (without CYGWIN1.DLL, more precisely). Furthermore, although some progress has been made, Cygwin is relatively slow (it's a POSIX compatibility layer, after all). If possible, I prefer to recompile my applications directly with MinGW. For me, this allows for a faster development cycle. Note, though, that Cygwin can compile MinGW-compatible executables. It's just that, as I said, I prefer to work with MinGW directly. I only work on Windows if I have to develop applications for Windows. But Linux's development tools are the best, and we can access several of them by using MinGW. I think that Cygwin is best suited for general cross-development and for handling complicated software porting.

System Calls and int 0x80

A system call is a request by an active process for a service performed by the operating system kernel. Remember that a process is an executing (running) instance of a program, and the active process is the process currently using the CPU. The active process may perform a system call to request the creation of another process, for instance. Or perhaps the process needs to communicate with a peripheral device. In Linux on x86, int 0x80 is the assembly language instruction that is used to invoke system calls. int 0x80 is a software interrupt, as it is raised by a software process, not by a hardware device. Before raising this interrupt, our program has to store the system call number (which tells the operating system which service the program is requesting) in the proper CPU register (EAX, on 32-bit x86). Every interrupt is a signal to the operating system, notifying it about the occurrence of an event that must be handled.
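The idea of requesting a service by its system call number can be tried from C++ without writing assembly. Note the swap: this sketch does not execute int 0x80 itself. It assumes Linux with glibc and uses the library's syscall() function, which takes the system call number and enters the kernel with whatever mechanism fits the platform. The function name pidViaRawSyscall is made up for this example.

```cpp
#include <sys/syscall.h>  // SYS_getpid: the system call number (Linux)
#include <unistd.h>       // syscall(), getpid()

// Asks the kernel for the calling process's ID by passing the system call
// number explicitly, instead of going through the usual getpid() wrapper.
// Assumes Linux with glibc; pidViaRawSyscall is a hypothetical name.
long pidViaRawSyscall() {
    return syscall(SYS_getpid);
}
```

The result should match what getpid() returns for the same process. On Cygwin or plain Windows there is no Linux kernel behind the call, which is the root of the reader's problem.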
