Application Optimization Design of ColdFire On-Chip SRAM under Linux


Taking an MP3 decoder as an example, this paper presents a scheme for configuring the processor's on-chip SRAM under embedded Linux. The scheme improves the decoding efficiency of the code and reduces the power consumed during execution, and the resulting solution improves on both performance and cost.

1 Hardware platform and software architecture
The hardware platform is Freescale's MCF5329EVB development board. The terminal hardware includes a ColdFire5329 processor, 32 KB of on-chip SRAM, an 800×600 LCD display, a 9×3 matrix keyboard, an I2S audio decoding chip, 64 MB of SDRAM, a 10/100M Ethernet interface, and three UART interfaces. The software architecture is shown in Figure 1. It mainly consists of the MP3 decoder, the audio driver, the keyboard driver and the graphical user interface (GUI). μClinux is used as the operating system. μClinux is a version of Linux that has been heavily simplified and modified to suit embedded applications. It supports multiple file systems and multitasking and has a fairly complete network protocol stack, which makes it well suited to embedded applications.

2 MP3 decoding algorithm analysis
This paper uses an MP3 decoding program as the verification code. MPEG-1/2 Audio Layer 3 is a lossy compression algorithm designed for music and voice data. Its decoding process is fairly complex and includes functional modules such as Huffman decoding, inverse quantization, the inverse modified discrete cosine transform (IMDCT), and subband synthesis. After a block of MP3 data is read, the synchronization word in the data stream is detected first to locate the start of a frame. The frame header is then extracted, in particular the parameters required for decoding, and the frame side information is separated from the main data. The side information is decoded to obtain the Huffman decoding information and the inverse quantization information. Reordering, stereo processing, anti-aliasing, the IMDCT, and the subband synthesis filter bank then follow, and the result is the PCM output.
The MP3 decoding process is shown in Figure 2. It is roughly divided into two phases: the data flow control phase and the numerical calculation phase. The data flow control phase includes frame synchronization, side information decoding, and Huffman decoding. Of these, Huffman decoding operates on the encoded data, while the other steps operate on the frame control information.
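
The frame synchronization and header extraction steps described above can be sketched in C as follows. This is a minimal illustration, not the decoder used in the paper: the names mp3_frame_info and mp3_find_frame are hypothetical, only MPEG-1 Layer III headers are accepted, and free-format streams (bitrate index 0) are skipped for simplicity.

```c
#include <stddef.h>
#include <stdint.h>

/* Fields recovered from one MPEG-1 Layer III frame header (hypothetical type). */
typedef struct {
    int bitrate_kbps;
    int sample_rate_hz;
    int padding;        /* 1 if the frame carries one padding byte */
    int has_crc;        /* 1 if a 16-bit CRC follows the header */
    int channel_mode;   /* 0 stereo, 1 joint stereo, 2 dual channel, 3 mono */
    int frame_bytes;    /* total frame length including the 4-byte header */
} mp3_frame_info;

/* Scan buf[start..len) for the sync word and parse the header that follows.
 * Returns the byte offset of the header, or -1 if none is found. */
long mp3_find_frame(const uint8_t *buf, size_t len, size_t start,
                    mp3_frame_info *out)
{
    static const int bitrate_kbps[16] = {0, 32, 40, 48, 56, 64, 80, 96,
                                         112, 128, 160, 192, 224, 256, 320, 0};
    static const int sample_rate_hz[4] = {44100, 48000, 32000, 0};
    size_t i;

    for (i = start; i + 4 <= len; i++) {
        int br, sr;

        /* Sync word: the first 11 bits of the header are all ones. */
        if (buf[i] != 0xFF || (buf[i + 1] & 0xE0) != 0xE0) continue;
        /* Version bits 11 = MPEG-1, layer bits 01 = Layer III. */
        if (((buf[i + 1] >> 3) & 0x3) != 0x3) continue;
        if (((buf[i + 1] >> 1) & 0x3) != 0x1) continue;

        br = bitrate_kbps[buf[i + 2] >> 4];
        sr = sample_rate_hz[(buf[i + 2] >> 2) & 0x3];
        if (br == 0 || sr == 0) continue;   /* free-format or reserved value */

        out->bitrate_kbps = br;
        out->sample_rate_hz = sr;
        out->padding = (buf[i + 2] >> 1) & 0x1;
        out->has_crc = !(buf[i + 1] & 0x1); /* protection bit 0 means CRC present */
        out->channel_mode = buf[i + 3] >> 6;
        /* MPEG-1 Layer III: 1152 samples per frame, i.e. 144 * bitrate / rate bytes. */
        out->frame_bytes = 144 * br * 1000 / sr + out->padding;
        return (long)i;
    }
    return -1;
}
```

For the 128 kb/s, 44.1 kHz stream used later in the tests, a header such as FF FB 90 00 yields a frame length of 417 bytes, which tells the decoder where to look for the next sync word.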

3 Optimized design based on on-chip SRAM
3.1 Solution Analysis
Instructions execute much faster from SRAM than from DRAM. The ColdFire5329 processor integrates 32 KB of SRAM on chip, and this design uses that on-chip SRAM to optimize the decoding process. The main decoding functions in the source code are analyzed first, as listed in Table 1. The driver write function (write), subband synthesis (MPEGSUB_synthesis), the inverse modified discrete cosine transform (imdct_I) and the fast discrete cosine transform (fast_dct) consume most of the processor resources and account for almost 80% of the decoding time. Based on this analysis, the audio driver and the decoding functions above are each placed in SRAM for execution, which speeds up the streaming media decoder and reduces its consumption of processor resources.

3.2 Configuring the audio driver to execute in the on-chip SRAM
The Linux operating system places the kernel and the applications running on it at two privilege levels, usually called "kernel mode" and "user mode". Kernel mode has high privileges and can control the mapping and allocation of processor memory. The audio driver is an important part of the kernel. It runs in kernel mode, continuously reads audio data from the user-space decoder, and drives the audio chip to play sound and perform related functions. By modifying the μClinux-2.6 kernel code, the audio driver can be configured to execute in on-chip SRAM, mainly by changing the kernel link script. The link script merges the input files into one output file according to given rules and binds symbols to addresses.
To modify the kernel code without affecting the rest of the system, a new section definition (.sramcode) is added to the kernel link script, and the load address of this section is set to the processor's on-chip SRAM. The .sramcode section contains a code section (.sramtext) and a data section (.sramdata) that hold the driver's code and data. The sections are aligned with ALIGN(4), because on a 32-bit microprocessor this alignment reduces execution cycles and improves efficiency. Two symbols, _lsramcode and _lsramcodeend, mark the beginning and end of the .sramcode section. The specific implementation is as follows:


After the kernel link script has been modified, macro definitions are used in the audio driver to place the relevant functions and data in the code and data sections of .sramcode, and a copy function copies them into SRAM. After compiling and linking, the addresses of the driver functions and data in memory can be checked in the kernel's symbol map file, System.map. Figure 3 shows the mapped addresses of the audio driver functions in the processor's on-chip SRAM.
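
The macro and copy step might look roughly like the sketch below. It is an illustration under stated assumptions rather than the paper's driver code: SRAM_TEXT, SRAM_DATA, audio_fifo_level, audio_fifo_push and sram_copy_section are hypothetical names, the section attribute is the GCC extension for placing a symbol in a named section, and the copy routine takes explicit pointers so that it can be exercised on a host. In the kernel, the start and end arguments would presumably come from the _lsramcode and _lsramcodeend symbols.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper macros: tag driver code and data for the .sramtext and
 * .sramdata sections named in the text. The attribute is applied only on
 * GCC-compatible ELF toolchains; elsewhere the macros expand to nothing. */
#if defined(__GNUC__) && defined(__ELF__)
#define SRAM_TEXT __attribute__((section(".sramtext")))
#define SRAM_DATA __attribute__((section(".sramdata")))
#else
#define SRAM_TEXT
#define SRAM_DATA
#endif

/* Illustrative driver state and routine tagged for on-chip SRAM. */
SRAM_DATA int audio_fifo_level = 0;

SRAM_TEXT int audio_fifo_push(int samples)
{
    audio_fifo_level += samples;
    return audio_fifo_level;
}

/* Copy the section image [load_start, load_end) to its run address.
 * Returns the number of bytes copied. On the target this would run once
 * during driver initialisation, before any tagged function is called. */
size_t sram_copy_section(void *sram_run_addr,
                         const char *load_start, const char *load_end)
{
    size_t n = (size_t)(load_end - load_start);
    memcpy(sram_run_addr, load_start, n);
    return n;
}
```

Keeping the placement in two macros means that moving a function into or out of SRAM is a one-word change at its definition, which is convenient when only 32 KB is available.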

3.3 Configuring real-time data and functions to execute in on-chip SRAM
Real-time data and functions from user space are placed in on-chip SRAM for execution. Because the processor can access data and instructions in on-chip SRAM directly, the cycles spent fetching data and instructions drop and program execution becomes more efficient. First, the real-time data is placed in the processor's on-chip SRAM. This is done with the S_malloc and S_free functions: S_malloc requests a block of on-chip memory, and S_free releases a block obtained this way. To use S_malloc and S_free flexibly, a structure and an address pointer need to be defined:
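
The original definitions are not available here, so the following is only a minimal sketch of how such an allocator could be built. The names S_malloc and S_free come from the text, but their signatures, the s_block structure, the sram_head pointer and the first-fit strategy are all assumptions. A static array stands in for the 32 KB of on-chip SRAM so that the code can run on a host.

```c
#include <stddef.h>
#include <stdint.h>

#define SRAM_POOL_SIZE (32u * 1024u)   /* 32 KB, as on the ColdFire5329 */
#define S_ALIGN        sizeof(void *)  /* payload alignment (assumed) */

/* Header stored in front of every block (assumed layout). */
typedef struct s_block {
    size_t size;            /* payload bytes following this header */
    int used;               /* 1 if allocated, 0 if free */
    struct s_block *next;   /* next block in address order */
} s_block;

/* Host stand-in for the on-chip SRAM region. */
static union { long long ll; void *p; uint8_t bytes[SRAM_POOL_SIZE]; } sram_pool;

/* Address pointer to the first block; NULL until the pool is first used. */
static s_block *sram_head = NULL;

/* First-fit allocation; returns NULL if no free block is large enough. */
void *S_malloc(size_t size)
{
    s_block *b;

    if (size == 0 || size > SRAM_POOL_SIZE) return NULL;
    size = (size + S_ALIGN - 1) & ~(S_ALIGN - 1);

    if (sram_head == NULL) {            /* lazily create one big free block */
        sram_head = (s_block *)sram_pool.bytes;
        sram_head->size = SRAM_POOL_SIZE - sizeof(s_block);
        sram_head->used = 0;
        sram_head->next = NULL;
    }
    for (b = sram_head; b != NULL; b = b->next) {
        if (b->used || b->size < size) continue;
        /* Split the block if the remainder can hold a header and payload. */
        if (b->size >= size + sizeof(s_block) + S_ALIGN) {
            s_block *rest = (s_block *)((uint8_t *)(b + 1) + size);
            rest->size = b->size - size - sizeof(s_block);
            rest->used = 0;
            rest->next = b->next;
            b->size = size;
            b->next = rest;
        }
        b->used = 1;
        return b + 1;
    }
    return NULL;
}

/* Release a block obtained from S_malloc and merge adjacent free blocks. */
void S_free(void *ptr)
{
    s_block *b;

    if (ptr == NULL) return;
    ((s_block *)ptr - 1)->used = 0;
    for (b = sram_head; b != NULL; b = b->next) {
        while (!b->used && b->next != NULL && !b->next->used) {
            b->size += sizeof(s_block) + b->next->size;
            b->next = b->next->next;
        }
    }
}
```

A decoder would then request its hot buffers with a call such as S_malloc(4608) and fall back to ordinary malloc when NULL is returned, since the SRAM is small and shared with the driver and function slots.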


The real-time data of the MP3 decoding program can then be placed in on-chip memory through dynamic memory allocation. Loading a function into SRAM differs from loading real-time data and is implemented with pointers and an enumeration. A macro first sets the space reserved for each function to 4 KB, and the enumeration then assigns each function its start address for execution in the processor's on-chip SRAM.

SRAMFUNC2=SRAM_BIG_FUNC1+BIG_FUNC_SIZE,...};
Once the run-time load addresses have been defined, functions such as MPEGSUB_synthesis and imdct_1 in the MP3 decoding program are copied into on-chip SRAM with a memory copy operation. After compiling and linking, these functions are loaded into their corresponding SRAM blocks when the program runs and execute from there. This shortens the time the processor spends in the decoding functions and improves the execution efficiency of the program.
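
The slot layout and copy step can be sketched as below. This is an assumption-laden illustration: BIG_FUNC_SIZE and SRAM_BIG_FUNC1 appear in the surviving fragment, while SRAM_BIG_FUNC2, SRAM_BIG_FUNC3, SRAM_FUNC_END and sram_load_func are hypothetical names. The enumerators here are offsets from the start of SRAM rather than absolute addresses, and the caller passes the SRAM base, so the code can be exercised against an ordinary buffer on a host.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SRAM_SIZE      (32u * 1024u)
#define BIG_FUNC_SIZE  (4u * 1024u)    /* 4 KB reserved per function, as in the text */

/* Start offset of each function slot, counted from the SRAM base. */
enum sram_func_slot {
    SRAM_BIG_FUNC1 = 0,
    SRAM_BIG_FUNC2 = SRAM_BIG_FUNC1 + BIG_FUNC_SIZE,
    SRAM_BIG_FUNC3 = SRAM_BIG_FUNC2 + BIG_FUNC_SIZE,
    SRAM_FUNC_END  = SRAM_BIG_FUNC3 + BIG_FUNC_SIZE
};

/* Copy `len` bytes of a function image into the given slot.
 * Returns the run address inside SRAM, or NULL if the image does not fit.
 * On the target, the returned address would be cast to a function pointer
 * and called; that only works if the code is position independent or was
 * linked to run at the slot address, so it is not attempted here. */
void *sram_load_func(uint8_t *sram_base, enum sram_func_slot slot,
                     const void *image, size_t len)
{
    if (len > BIG_FUNC_SIZE) return NULL;
    if ((size_t)slot + BIG_FUNC_SIZE > SRAM_SIZE) return NULL;
    memcpy(sram_base + slot, image, len);
    return sram_base + slot;
}
```

Fixed 4 KB slots waste some space when a function is short, but they make every start address a compile-time constant, which keeps the call sites simple.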


4 Performance Test and Analysis
To verify the on-chip SRAM optimization, the MP3 decoder optimized with this scheme was tested on the MCF5329EVB development board.
First, a functional test was performed using the test stream recommended by the MPEG organization (128 kb/s, 44.1 kHz). An audio file, test.mp3, was decoded locally with both the standard floating-point decoder and the audio decoder designed in this paper, and the decoded waveforms were compared. The waveform comparison in Fig. 4 shows that the output of the decoder optimized by this scheme is essentially the same as that of the standard floating-point decoder. In listening tests, no difference between the two decoded outputs could be discerned. From a functional point of view, the on-chip SRAM optimization scheme designed in this paper is therefore feasible.
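
The paper compares the two outputs visually and by ear. As a complement, the difference between two decoded PCM buffers can also be quantified numerically. The sketch below is not part of the original test procedure: pcm_diff and pcm_compare are hypothetical names, and 16-bit samples are assumed.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Summary of the sample-by-sample difference between two PCM buffers. */
typedef struct {
    int max_abs_diff;      /* largest absolute difference in one sample */
    double mean_sq_diff;   /* mean of the squared differences */
} pcm_diff;

/* Compare n samples of a reference decode against a test decode. */
pcm_diff pcm_compare(const int16_t *ref, const int16_t *test, size_t n)
{
    pcm_diff d = {0, 0.0};
    double sum = 0.0;
    size_t i;

    for (i = 0; i < n; i++) {
        int diff = abs((int)ref[i] - (int)test[i]);
        if (diff > d.max_abs_diff) d.max_abs_diff = diff;
        sum += (double)diff * (double)diff;
    }
    if (n > 0) d.mean_sq_diff = sum / (double)n;
    return d;
}
```

A small maximum difference, on the order of a few least significant bits, would be consistent with the observation that the two waveforms are essentially the same.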

Second, a performance test was performed. The MIPS consumption and memory consumption of the decoder before and after optimization were compared on the test platform, as listed in Table 2.

Before optimization, the decoder consumes 68 MIPS at 240 MHz; after optimization, it consumes 39.2 MIPS at 240 MHz. Under the same hardware conditions, memory consumption increases, but decoding efficiency improves greatly with this scheme.
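
Worked out from the two figures above, and assuming both were measured on the same stream, the relative saving and the equivalent speed-up are:

```latex
\[
\frac{68 - 39.2}{68} = \frac{28.8}{68} \approx 0.424,
\qquad
\frac{68}{39.2} \approx 1.73
\]
```

That is, the optimized decoder needs about 42% fewer MIPS, which corresponds to decoding roughly 1.73 times as efficiently.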

Conclusion
This paper presents an application optimization design based on on-chip SRAM under the embedded Linux operating system. Taking an MP3 decoder as an example, the decoder is optimized by configuring the audio driver, real-time data and functions to execute in the processor's on-chip SRAM, and the design was implemented successfully on the ColdFire5329 development platform. The optimized MP3 player decodes efficiently with good sound quality and achieves real-time playback on low-end and mid-range processors, which makes it possible for low-performance CPUs to handle complex applications. The scheme improves the execution efficiency of the application and reduces power consumption, and it is a useful reference for the development of embedded Linux application products.
