Mapping a Nano-Programmable Logic Array
Michael James Wilson
Implementation of Computation Laboratory,
Computer Science Department, California Institute of Technology,
Miniaturization of computing systems leads to speed gains and the ability to manufacture smaller devices. As the limits of miniaturization on conventional silicon manufacturing are approached it becomes necessary to look for other ways to make computing systems smaller. Current technology allows the production of carbon nanotubes and silicon nanowires which are a few nanometers in diameter. An array-based architecture has been proposed to organize nanoscale wires into the basic building blocks for a computing system. To perform complex tasks, higher level circuits such as Programmable Logic Arrays (PLAs) should be mappable to this architecture. This project explores generating PLAs for the nanoscale architecture. Experimenting on standard benchmarks and constructing examples yields information on the best, worst, and average-case characteristics of these circuits. This information is important for determining the types of nanoscale arrays that must be constructed to implement various types of logic.
Programmable logic devices are now mostly produced through lithography. Circuits are etched onto a silicon chip. Advances in technology have resulted in impressive degrees of miniaturization with these methods, but we may soon reach the bottom limit of size with lithographic techniques. New bottom-up synthesis techniques may allow us to build complex computing devices with features that are a few atoms across.
There is a proposed architecture for organizing arrays of silicon nanowires into a computational system[1], and methods for addressing nanoscale wires with more conventionally-sized microscale wires[2]. A natural extension of this work is mapping logic components to the architecture. A programmable logic array (PLA) is a circuit which can be programmed to perform arbitrary boolean logic. Working at the nanoscale presents challenges such as faulty wires and random assembly of arrays. A programmable circuit can work around many instances of these problems. PLAs are also well-studied in conventional architectures, and a good deal of benchmarks and data for comparison exist. These factors suggest PLAs as a good starting point for mapping logic to nanoscale arrays.
Nanoscale architecture
The nanoscale architecture1 supports building OR planes with inverting and buffering planes at right angles to these OR planes. This allows for a single buffered NOR. Connecting several of these planes together produces cascaded NOR gates which may be used to implement PLAs. It has been shown how four NOR planes can be used to produce one type of PLA[3], and this idea extends to arbitrarily many planes.
Figure
1: A simple
nanoscale PLA
Other analysis has shown that building nanoscale PLAs with 60 output terms yields the least size consumed per output term.
Mapping
Synthesis and mapping basically generated an arbitrary number of nand planes like the nand planes found in Whirlpool PLAs3 and without optimization beyond that provided by CAD tools such as SIS[4]. Since two NAND planes are essentially equivalent to two NOR planes this mapping extends to the nanoscale architecture outlined above. To accommodate multiple planes on a single PLA, the planes are “wrapped” into a spiral, as show in Figure 2:

Figure 2:
Simplified example of a “wrapped” design. On the left, we have two planes and no
wraps. On the right, we have eight
planes but only six are used; thus the design is “wrapped” three times.
The mappings presented here followed these conventions: all inputs come into the first plane, all outputs leave through the last plane, and the PLA outputs do not feed back into the PLA. Allowing inputs and outputs to enter and leave at any point in the PLA allows for more optimization, and feedback is important in finite state machines, but the details of supporting these features have yet to be worked out. To account for these conditions, any inputs that were first used by a plane later than the first plane were buffered through all the intermediate planes, and outputs from planes earlier than the last plane were likewise buffered through to the end. Each buffer consumed an input signal and an output signal in the buffering plane. For the finite state machines, only one loop through the machine was considered (i.e., no feedback signals).
Using more NOR planes should give better results for many designs, because each NOR plane in this context corresponds to a level of logic. In a circuit such as an xor, there is a certain level of logic where the least number of terms are needed. Using less levels of logic (NOR planes) results in exponential growth in the number of terms used by the circuit. Thus, “wrapping” the design should give better results.
Selections from three types of benchmarks were mapped: datapath examples composed of adders, xor circuits, and other basic building blocks; small finite state machines from the IWLS93 benchmark suite[5]; and PLA book examples which came with the logic minimization tool espresso[6].
Results
|
Design |
Size |
Number of
Wraps |
|
add |
3 |
1 |
|
add |
4 |
4 |
|
add |
8 |
8 |
|
alu181 |
2 |
2 |
|
enable_counter |
9 |
1 |
|
enable_counter |
15 |
2 |
|
multiply |
3x3 |
1 |
|
register_file |
8x1 |
1 |
|
register_file |
3x4 |
1 |
|
shiftreg |
60 |
1 |
|
xor |
6 |
1 |
|
xor |
30 |
5 |
Figure 3: Datapath results.
Refer to figure 3. Note that for several designs (add, enable counter, register file, xor) it was possible to fit larger designs in the 60 output array if the design was wrapped multiple times. For instance, wrapping the add design 8 times allowed an 8-bit adder to fit, whereas a 3-bit adder was the largest adder that would fit without any wrapping.
|
Design |
Inputs |
Outputs |
States |
|
tbk |
6 |
3 |
32 |
|
pma |
8 |
8 |
24 |
|
s1 |
8 |
6 |
20 |
|
ex1 |
9 |
19 |
20 |
|
keyb |
7 |
2 |
19 |
|
s420 |
19 |
2 |
18 |
|
s208 |
11 |
2 |
18 |
|
sse |
7 |
7 |
16 |
|
kirkman |
12 |
6 |
16 |
|
bbase |
7 |
7 |
16 |
|
cse |
7 |
7 |
16 |
|
dk512 |
1 |
3 |
15 |
|
mark1 |
5 |
16 |
15 |
|
ex4 |
6 |
9 |
14 |
|
s386 |
7 |
7 |
13 |
|
bbara |
4 |
2 |
10 |
|
opus |
5 |
6 |
10 |
|
dk17 |
2 |
3 |
8 |
|
shiftreg |
1 |
1 |
8 |
|
ex6 |
5 |
8 |
8 |
|
dk14 |
3 |
5 |
7 |
|
beecount |
3 |
4 |
7 |
|
dk27 |
1 |
2 |
7 |
|
s27 |
4 |
1 |
6 |
|
bbtas |
2 |
2 |
6 |
|
mc |
3 |
5 |
4 |
|
lion |
2 |
1 |
4 |
|
dk15 |
3 |
5 |
4 |
|
tav |
4 |
4 |
4 |
Figure 4: FSM results.
All the IWLS93 finite state machines5 with less than 30 states were mapped. Multiple encodings were used, and the encoding yielding the best result for each design was kept. Refer to figure 4 for examples of the types of finite state machines that were mappable to a 60 output array. The PLA benchmarks produced similar results for logic densities.
These results could be improved through further optimization, but they give baseline numbers for what fits in a single 60 output array.
Conclusions
The data in this report suggests that small PLA designs can be implemented on nanoscale arrays. “Wrapping” PLA designs yields an improvement in the number of signals consumed in many designs.
Suggestions for further research
To further increase the efficiency of the mapping, more optimization methods such as Doppio-Espresso3 could be used. Adjustment of parameters in the optimization process could yield slightly better results as well.
Since the optimal size of the nanoscale arrays is so small (60 output terms), some circuits will simply not fit on a single array. It will be important to investigate using nanoscale PLAs as blocks in larger circuits and develop block-level placement and routing techniques.
Finally, as nanoscale technologies evolve and change new structures will become feasible that were impossible to build before. Taking advantage of the advances in this field will be important in designing the most efficient mappings and arrays.
Methods
File formats. All input files
and benchmarks were in espresso format.
Output files were in a propriety “nand format” which is very similar to
an espresso file, differing in that it has only one plane and the operation
performed in the plane is nand as opposed to product or sum. For verification and as intermediate files,
blif files were generated.
Optimization and nand plane generation. The nand planes
were generated as follows. The input was
run through espresso. Then the optimized
input was run through SIS with the following commands: “remove_latches; collapse; source
script.rugged; reduce_depth -d <depth>; write_blif” where <depth>
is the target depth (2 times the number of levels in the final network). This generates a blif file, which is then
parsed and broken into each individual level of logic, and then each level is
written out as a nand plane. Blif files
were then generated from the output nand files and the initial input file, and
MVSIS was used to verify their equivalence.
[1]. DeHon, A. Array-Based Architecture for FET-based, Nanoscale
Electronics. IEEE Transactions on Nanotechnology, 2(1), 23-32 (March
2003)
[2]. DeHon, A., Lincoln, P., Savage, J. E. Stochastic Assembly of
Sublithographic Nanoscale Interfaces. IEEE Transactions on Nanotechnology,
2(3), 165-174 (September 2003)
[3]. Mo, F., Brayton, R. K., Whirlpool PLAs: A Regular Logic Structure and Their
Synthesis. IEEE/ACM International
Conference on ICCAD 2002, 543-550 (2002)
[4]. Sentovich, E. M., Singh, K. J., Lavagno, L., Moon, C., Murgai, R.,
Saldanha, A., Savoj, H., Stephan, P., Brayton, R. K., Sangiovanni-Vincentelli,
A. SIS: A System for Sequential Circuit Synthesis, UCB/ERL M92/41, University
of California, Berkeley (May 1992)
[5]. McElvain, K., IWLS’93 Benchmark Set: Version 4.0,
<http://www.cbl.ncsu.edu/pub/Benchmark_dirs/LGSynth93/doc/iwls93.ps> (May
1993)
[6]. <ftp://ic.eecs.berkeley.edu/pub/Espresso/espresso-book-examples.tar.gz>
Acknowledgements I
thank André DeHon for mentorship. I also
thank Rafi Rubin and Michael Wrighton for technical assistance.
Competing interests statement The author declares that he has no competing financial interests.
Correspondence and requests for materials should
be addressed to M.J.W. (e-mail: mwilson@caltech.edu).