Archive for the ‘Drexel’ Category

Superscalar — The Easy Way!

Wednesday, April 14th, 2010

I really enjoy taking courses with useful, practical content. It doesn’t happen all the time, but I’ve had a good run of luck recently. Tuesday’s lecture in the ECEC622 Parallel Computer Architecture course I’m taking was about OpenMP, a very easy-to-use, openly specified API for shared-memory parallel programming in C/C++ and Fortran.

OpenMP is what an API should be — both useful and very easy to incorporate into an application, even for programmers encountering it for the first time. Within a few minutes of learning the fundamentals, I was able to write a multithreaded version of the ubiquitous “Hello, World!” program.
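For anyone who hasn't seen it, a multithreaded "Hello, World!" can look roughly like the sketch below. This is not the exact program from the lecture; the function name hello_from_all_threads is made up for illustration, and the _OPENMP guards are only there so the file still compiles (single-threaded) when OpenMP is switched off.

```c
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Prints one greeting per thread and returns how many greetings were printed.
   The function name is hypothetical; it is not part of OpenMP. */
int hello_from_all_threads(void) {
    int greetings = 0;

    /* Every thread in the team runs this block once; the reduction
       sums each thread's private counter into 'greetings'. */
    #pragma omp parallel reduction(+:greetings)
    {
        int id = 0;     /* fallback values when built without OpenMP */
        int total = 1;
#ifdef _OPENMP
        id = omp_get_thread_num();
        total = omp_get_num_threads();
#endif
        printf("Hello, World! from thread %d of %d\n", id, total);
        greetings += 1;
    }
    return greetings;
}
```

Call it from main() and build with GCC's -fopenmp flag; setting the OMP_NUM_THREADS environment variable changes how many greetings you get. The order of the lines is not guaranteed, since the threads race each other to the terminal.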

Of course, whenever I want to really start to learn a language, architecture, or API, I use it to write a Mandelbrot Set program. Mandelbrot Set calculation lends itself extremely well to parallelization APIs like OpenMP — it’s what programmers refer to as “embarrassingly parallel.”

Here is the Mandelbrot Set calculation code. The OpenMP modifications are the #include <omp.h> line and the two #pragma omp lines. Omit them, and the code does exactly the same thing, but without the parallelization.

#include <stdio.h>
#include <omp.h>

//Mandelbrot Set calculation routine, to test
//speedup obtained from using OpenMP

//M. Eric Carr
//mec(eighty-two) .at. drexel (dot...) edu

int main(){
	const double rmin = -2.2;
	const double rmax = 1.4;
	const double imin = -1.8;
	const double imax = 1.8;
	const unsigned long long maxiter = 20000;
	const unsigned long long xres = 2000;
	const unsigned long long yres = 2000;

	double a, b, r, i, h;	//Private variables for threads

	unsigned long long totalcount=0;
	unsigned long long count=0;
	unsigned long long x,y;
	unsigned long long iter;

	double dx, dy;
	dx = (rmax-rmin)/xres;
	dy = (imax-imin)/yres;

	#pragma omp parallel for private(a,b,r,i,h,x,y,iter) reduction(+:count)
	for(y=0;y<yres;y++){
		b = imax-y*dy;
		for(x=0;x<xres;x++){
			r=0;
			i=0;
			a = rmin + x*dx;
			iter=0;
			while(iter<maxiter && r*r+i*i<=4.0){
				h=(r+i)*(r-i)+a;
				i=2*r*i+b;
				r=h;
				iter++;
				}
			if(iter>=maxiter){	//Never escaped: count the point as inside the set
				count++;}
			} //for x
		} //for y

	#pragma omp barrier

	printf("dx is: %F\n",dx);
	printf("dy is: %F\n",dy);
	printf("Total count is: %llu\n",count);

	dx=count*dx*dy;

	printf("Total area is: %F\n",dx);

	return(0);

	} //main

Three extra lines of code, to share the workload among however many CPU cores your system has (eight virtual cores, on a Core i7 CPU). Talk about a good return on your coding time!
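One possible refinement, offered as a sketch rather than a measured improvement: rows that pass through the middle of the set run all the way to maxiter for many points, while rows near the top and bottom escape almost immediately, so splitting the rows into equal static chunks may leave some threads idle. The variant below asks OpenMP to hand out rows on demand with schedule(dynamic). The function name mandelbrot_area and its parameters are my own; the bounds and the inner loop are the ones from the listing above.

```c
/* Estimates the area of the Mandelbrot Set by counting grid points that
   never escape within maxiter iterations. Name and signature are
   illustrative; the bounds match the original listing. */
double mandelbrot_area(int xres, int yres, unsigned long long maxiter) {
    const double rmin = -2.2, rmax = 1.4;
    const double imin = -1.8, imax = 1.8;
    const double dx = (rmax - rmin) / xres;
    const double dy = (imax - imin) / yres;
    unsigned long long count = 0;

    /* Variables declared inside the loop body are private to each thread,
       so no private(...) clause is needed. schedule(dynamic) gives a
       thread a new row whenever it finishes its current one. */
    #pragma omp parallel for schedule(dynamic) reduction(+:count)
    for (int y = 0; y < yres; y++) {
        double b = imax - y * dy;
        for (int x = 0; x < xres; x++) {
            double a = rmin + x * dx;
            double r = 0.0, i = 0.0;
            unsigned long long iter = 0;
            while (iter < maxiter && r * r + i * i <= 4.0) {
                double h = (r + i) * (r - i) + a;
                i = 2 * r * i + b;
                r = h;
                iter++;
            }
            if (iter == maxiter) {
                count++;    /* never escaped: treat as inside the set */
            }
        }
    }
    return (double)count * dx * dy;
}
```

Whether dynamic scheduling actually beats the default depends on the compiler, the core count, and the resolution, so it is worth timing both before believing either.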

True 8-bit computing!

Friday, April 2nd, 2010

A while ago, I realized that the DrACo/Z80 was actually quite a bit more complex than it needed to be, to suit the purposes of the EET325 class. Since it is programmed in machine code, the programs written for it tend to be very small, both in terms of code size and memory usage.

Since wire-wrapping all of the connections is by far the most time-consuming part of the build process, this suggested a possible shortcut. Instead of wiring up all sixteen address lines, why not make it a true 8-bit computer — with only eight address lines? Sure, it will only be able to access the first 256 bytes of its memory, but nobody ever uses more than that in the class, anyway. (…and if any students get that ambitious, it can still be upgraded to 16-bit easily enough.)

Here are the updated schematics (including a few bug fixes and annotations). The 74LS245s between the Z80 and the bus have been removed, as well, since even the original 16-bit prototype runs well enough without them.

DrACo/Z80 Control Panel (8-bit version)

DrACo/Z80 Core (8-bit version)

Offline Password Reset

Friday, February 26th, 2010

It’s fun to get paid to hack.

We got a new laptop at work that had just arrived from the Provost’s office. They did a good job setting it up, but somehow in the shuffle, we weren’t told what the administrator password was.

After trying the usual suspects, we resorted to Plan B — a bootable, highly-customized Linux CD specializing in Windows password removal. In less than five minutes (three of which were spent finding a USB CD drive to plug in), the admin password was gone. (Security? We don’t need no steenkin’ security…)

http://pogostick.net/~pnh/ntpasswd/

(Use it with caution; if any encrypted files are on the machine, resetting the password will make them unreadable.)

Enjoy — and remember, kids — be a Jedi, not a Sith.

cp /dev/drillpress /dev/sda1

Thursday, February 18th, 2010

Paleotech Formatting. When it absolutely, positively has to be made unrecoverable…

A hard drive used by one of our financial gurus at work died. A recovery service was able to get most or all of the data out, and I was asked to handle the warranty claim for a new hard drive. We had purchased the “no-HD-return” option on the warranty, and Dell did a nice job getting us a good replacement drive the next day — but we still needed to make sure the old drive couldn’t be read by nefarious evildoers (should they decide to go Dumpster-diving and have a cleanroom).

I vaguely remembered reading that one suggested Best Practice for such a situation involved a drill press — which we just so happened to have available in the lab.

I dare say the data has done left the building.

Well, this isn’t encouraging…

Monday, October 26th, 2009

Just stopped by our IT department at work; our sysadmin guru was installing Windows 7 on a test PC (a Dell GX620, one of our standard configurations). According to him, it had already bluescreened once on him. While I was there, he was running into a bunch of other errors, all during the install process. Admittedly, he was trying to upgrade an existing XP installation, but this is still not a good sign.

Very often, Windows installation errors are due to user error; I’ve seen this happen many times. But this guy knows his stuff — if he made a mistake, it’s because it’s there waiting to be made.

Looks like XP or Linux for the foreseeable future, for me…

Stupid Robot Tricks

Wednesday, February 11th, 2009

New video — getting two robots to “share” access to an object. (Robot B waits for the signal from Robot A before starting, then Robot A waits its turn for the cycle to start over.) Since the robots aren’t bolted down, the positions drift — the solution for this is to either secure them or run them at a slower speed. The test works as a proof of concept and test of the communications between the two, though.

Energy Star — Paleotechnology style!

Monday, December 15th, 2008

A while back, I realized that the Z80 was drawing (relatively speaking) a lot of power. I know modern CPUs can both crunch your numbers and cook your dinner; the Intel Core i7 datasheet specifies a maximum of 145 AMPS of current. (I don’t think my car’s starter motor draws that much, some days.)

The Z80, though, being from the ancient glory days of yore when CPUs didn’t even require a heatsink, let alone sophisticated cryogenics, didn’t really strike me as a power hog. In fact, the version the DrACo/Z80 uses is CMOS-based (for static clockability). It couldn’t be drawing some 700 milliamps of current all on its own, could it?

No, as it turns out. The Z80 itself is quite efficient. The 74LS245 buffer chips, on the other hand, draw 40 or 50 mA apiece, even when doing absolutely nothing. They just sit there and get warm! “Low-power Schottky,” my paleotechnological posterior!

A quick look online turned up the drop-in replacement 74HCT245 version, which is much more power-friendly. (These only draw a few microamps when idle.) The results speak for themselves…

Much more Earth-friendly! (…and now the computer can be run from a USB port or from NiMH batteries. Whether the Department of Security Theater would let me on a plane or on Amtrak with it is most likely another story, though.)

I’m also experimenting with removing the ‘245 chips connecting the Z80 to the bus. It works well enough to do the Prime Number program, but may not be as stable for high-speed operation. More on this later (time permitting).

Prime Time

Sunday, December 14th, 2008

…and now, for something completely different. (By different, I mean almost useful!)

The Z80 computer is now busy computing prime numbers! I wrote a straightforward BASIC program, which I then compiled using Oshonsoft’s Z80 simulator suite of utilities. The resulting assembly program was pretty good, but needed some optimization. (BASIC isn’t really geared towards byte manipulations.) I replaced the crude BASIC mod-and-div output routines I had written with four LD and four OUT instructions. Here is the (probably still really inefficient) assembly code.
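For reference, the general idea can be sketched in C. This is plain trial division and is only a guess at the shape of a "straightforward" prime finder, not a transcription of the BASIC or assembly code; the function names are made up, and everything is kept to 8-bit values to stay in the spirit of the hardware.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if n is prime, testing every divisor d with d*d <= n. */
static int is_prime(uint8_t n) {
    if (n < 2) {
        return 0;
    }
    for (unsigned d = 2; d * d <= n; d++) {
        if (n % d == 0) {
            return 0;
        }
    }
    return 1;
}

/* Writes the primes <= limit into out[] in ascending order and returns
   how many were found. out needs room for 54 entries, since there are
   54 primes that fit in a byte. */
size_t primes_up_to(uint8_t limit, uint8_t *out) {
    size_t found = 0;
    /* n is wider than 8 bits so the loop still terminates when limit is 255. */
    for (unsigned n = 2; n <= limit; n++) {
        if (is_prime((uint8_t)n)) {
            out[found++] = (uint8_t)n;
        }
    }
    return found;
}
```

On the real machine each prime would go out through an OUT instruction instead of into an array.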

Some technical progress, too: Loading complex programs is becoming progressively easier with the addition of a hex editor to my toolkit. Instead of toggling code in a byte at a time from the control panel, I can now load it into a serial EEPROM and plug it into the Virtual ROM peripheral. With a few lines of code (which I also copied to the ROM because I’m lazy), this code can be copied into base memory automagically. All that needs to be toggled in to start is a single, 3-byte JUMP command.

For instance, to bootload the prime-number program into base memory:
* Enter Program mode
* Connect the ROM
* Start the clock running
* Load the following instruction into memory:
0000 C3
0001 00
0002 86
* Hold down RESET, exit Program mode, and release RESET.
– The Z80 will jump to 0x8600, which is a quick routine to load the main primes program (8300:854B) into base memory (0000:034B).
– At the end of this routine, a JUMP command will cause it to halt.
* Perform a RESET to start the prime-number program running.
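The three bytes toggled in above are just the Z80 JP opcode (C3) followed by the target address, low byte first. Here is a small C sketch of that encoding, plus a byte-by-byte forward copy of the kind a bootloader routine would perform. The function names and the 64 KiB array standing in for the address space are assumptions for illustration; the actual routine in the ROM is Z80 code and is not shown here.

```c
#include <stdint.h>

/* Encodes an absolute Z80 jump: opcode 0xC3, then the 16-bit target
   address in little-endian order (low byte, then high byte). */
void encode_jp(uint16_t target, uint8_t out[3]) {
    out[0] = 0xC3;
    out[1] = (uint8_t)(target & 0xFF);
    out[2] = (uint8_t)(target >> 8);
}

/* Copies len bytes from src to dst, one byte at a time in ascending
   order, within a simulated 64 KiB address space. mem must point to
   0x10000 bytes. Addresses wrap at 16 bits, as they would on the Z80. */
void copy_block(uint8_t *mem, uint16_t src, uint16_t dst, uint16_t len) {
    for (uint16_t k = 0; k < len; k++) {
        mem[(uint16_t)(dst + k)] = mem[(uint16_t)(src + k)];
    }
}
```

Calling encode_jp(0x8600, bytes) gives C3 00 86, matching the listing. Taking the 8300:854B range as inclusive, the copy length would be 0x024C bytes.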

(I actually started loading the bootloader into base memory, but then realized that it would quickly overwrite itself. A workaround, if you don’t have a bootloader written into ROM, would be to toggle in the bootloader at a higher address than you plan to use for the program, then JUMP to that address (for instance, 0x7000) from 0x0000.)

I’m using my ancient HP 1630A logic analyzer to look at the output from the program (POD2 connected to the data lines, L clock connected to ~IORQ); it’s calculated the primes through 113 so far.

I also managed to complete Problem 001 from Project Euler (“What is the sum of all numbers less than 1,000 which are multiples of three or five?”), using the Z80.
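For comparison, the brute-force version of that problem in C is only a few lines. This is a sketch of the arithmetic, not the Z80 code; the function name is made up.

```c
/* Sums every positive integer below 'limit' that is a multiple of 3 or 5. */
unsigned long sum_multiples_3_or_5(unsigned limit) {
    unsigned long sum = 0;
    for (unsigned n = 1; n < limit; n++) {
        if (n % 3 == 0 || n % 5 == 0) {
            sum += n;
        }
    }
    return sum;
}
```

For a limit of 1,000 the total is 233168. That is more than 16 bits can hold, so an 8-bit CPU with 16-bit register pairs has to carry the sum into a third byte.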

I also realized this weekend that my new Gigabyte GA-EX58 Extreme motherboard doesn’t have any RS232 ports. Such is “progress,” I guess. *sigh* Time for an add-in card. (Do they make PCIe RS232 cards??)

…and then there were two. Correction, four.

Monday, December 8th, 2008

The EET325 class is wrapping up this week, but not before I got Bill’s prototype Z80 core working with the original protoboard control panel. Not only that, but two of the students (Mike and Austin) have finished, working Z80s. So now there are four of them! Can a Beowulf cluster of Z80s be far behind?

Single-board at last

Tuesday, November 11th, 2008

To paraphrase Emperor Palpatine…

“And now, witness the power of this fully ARMED and OPERATIONAL … Z80 computer!”

OK, so running at the current 40 kHz system clock, it’s got more milliamps than megaflops, but at least it’s running properly on one board (CPU, memory, controls, and LEDs all together).

More about the milliamps to come: it’s currently drawing over half an amp(!) at 5V — almost all of which is going into keeping the 74LS245s warm. The parts should be in tomorrow to give it a quick “Energy Star” upgrade.

(The dials on the right are, from top to bottom, address-high-byte, address-low-byte, and data.)