Embedded Linux System
Description and why
I like to dabble in embedded systems; this is my bread and butter. During college I was able to take an "Independent Study" course, which lets you do any project you want in place of a normal class. Since I did projects on my own anyway, this was awesome. At the time I really wanted to build a Linux capable board, but found that most information on the topic targeted people who were already experienced, or in other words, there was very little material for someone just getting started.
This project therefore served three purposes: first, to make a Linux capable board because that's just too awesome; second, to create a guide (and reference for myself) for someone who was exactly in my situation in terms of knowledge; and third, to get those three credits. This was done in 2015 using hardware available to me as a hobbyist (Broadcom scoffs at emails asking for pricing in small volume from mere mortals like myself), within the limitations of OSH Park's 4 layer service for making PCBs.
Please note, this writeup is still in progress! Typos and weird grammar galore (yes, even many many years later as this gets "updated").
Please check out the git repository version of this, which will show you all the extra files used to make this happen, for example the .txt based docs meant for myself, the schematics, and the gerbers.
Structure
This is divided into two designs: one based on Atmel's AT91SAM9N12, which is the currently working board, and the other based on Freescale's i.MX233. Each project directory contains schematics, board files, potentially necessary patches, and various helper scripts.
Also, here are some pictures of the designs in progress.
Thanks Henrik!
For those who don't know, Henrik made a board based on Atmel's SAM9N12 SoC (which I heavily used as a reference) roughly a year ago. This is what showed me, as well as likely hundreds of others, that such a thing is possible to do at home without thousands of dollars in equipment. His fantastic walkthrough is what inspired me and guided me through nights of confusion and bewilderment, and if it weren't for him and his walkthrough this would not have been possible. So, Henrik, thank you!
Propagation
Atmel SAM9N12 Embedded Linux System
Description
The SAM9N12 is a low cost MPU from Atmel (now Microchip) that is capable of running Linux. It's based on a 400 MHz ARM926EJ-S core from ARM (very old, a far cry from what you can get in even the cheapest of phones or tablets) and has a 32 bit EBI (External Bus Interface) to work with both SDRAM and NAND Flash at the same time. It can also use SPI based flash, and USB OTG lets you use a USB flash drive as another data store.
This board has:
- 64 MB of DDR2-SDRAM (W9751G6KB-25-ND)
- 4 MB of Dataflash (AT45DB321E-SHF-B)
- USB OTG and USB Device broken out (both only up to 12 Mbps Full Speed)
Originally this project was done years ago relying on Atmel's forked branches of U-Boot and the Linux kernel. While this did work, it meant having to use old, non-mainlined versions of each project. Later this project was restarted with the intention of using mainline sources for everything and Buildroot to streamline the Linux kernel and root file system generation. The old version of this documentation, dubbed "Old School", can be found here, and shows how to do everything manually. This includes compiling the Linux kernel by hand to generate a zImage, creating a minimal root file system, and even using a flash drive for the root file system, including a native GCC toolchain!
This documentation will show the process of bringing a board like this up from start to finish, including describing the intended boot flow, how to configure all the software (AT91 Bootstrap, Kernel, root file system, BuildRoot, etc), show various bugs and issues found in the process (USB OTG was a fun one), how much space various functionality uses, and how to build a semi useful root file system (networking, stress, htop, tmux, etc).
The end goal is to have a system that can do the following:
- Boot to a shell and be able to communicate with it over serial
- Networking support
- Use the RNX-N150HG USB WiFi dongle to talk to the outside world
- Read only file system with compression (SquashFS)
- Run various tools (htop, stress, tmux, ping)
- If possible, an ssh server and TCC to compile a small C based demo program
Status
Everything seems to work except NAND flash, likely due to soldering issues with the SAM9N12 BGA based package. Because NAND flash isn't working, it was decided to try and stick everything in the 32 Megabit Data flash (yep, only 4 Megabytes for a boot loader, kernel, and root file system). USB OTG is used for a Wifi dongle so we get networking support to make this more interesting.
Boot Flow
The AT91SAM9N12 has a boot loader in ROM which can boot from NAND Flash, SPI Flash, SPI Data flash, and can even boot directly into the Linux kernel (therefore not needing U-Boot). There has been some reverse engineering work done on the boot loader here.
The memory setup we will be using is as follows:
AT45DB321E-SHF-B   32 Mbit ->  4,325,376 bytes or 0x0042_0000 (8192 pages * 528 bytes)
W9751G6KB-25-ND   512 Mbit -> 67,108,864 bytes or 0x0400_0000

Flash offset  Max size                Item            Copied to DRAM
0x000000      0x0026D8                AT91 Bootstrap  Not copied
0x002800      0x004B00                Device Tree     0x2100_0000
0x007300      0x1D8D00 (1,936,640 B)  zImage          0x2200_0000
0x1E0000      0x237C00 (2,325,504 B)  RootFS          Not copied
0x417C00      0x008400                Non Volatile    Not copied/used
As a rough overview to show in the grand scheme of things how this will look:
- The boot loader will first look for any boot-able data in SPI Flash, SPI Data flash, and NAND Flash. In our case, we have the AT91 BootStrap in Dataflash at an offset of zero (start of Data flash).
- The AT91 Bootstrap will initialize DRAM and then copy data from Dataflash into DRAM: the kernel from an offset of 0x7300 to 0x2200_0000, and the Device Tree from an offset of 0x2800 to 0x2100_0000.
- Then the boot loader (AT91 Bootstrap) will initialize the environment for the Linux kernel (address of device tree in a register, etc) and pass execution to the kernel.
- Since the kernel is a zImage (self extracting kernel image), the kernel will uncompress itself, execute itself, read the device tree which was copied to RAM earlier, initialize various drivers based on the device tree entries, and lastly pass execution to whatever is relevant in the root file system.
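The memory map above is easy to get wrong by a digit, so it can be worth checking with a little shell arithmetic. The following is a minimal sketch, not part of the original tooling: the helper name check_layout is made up for this example, and the offsets and sizes are copied by hand from the table. It only verifies that no region starts before the previous one ends and that the last region ends inside the Dataflash.

```shell
#!/bin/sh
# AT45DB321E: 8192 pages * 528 bytes = 4,325,376 bytes (0x420000).
FLASH_SIZE=$((8192 * 528))

# check_layout (hypothetical helper): reads "name offset max_size" lines on
# stdin, fails if a region starts before the previous one ends or if the
# last region runs past the end of the flash.
check_layout() {
    prev_end=0
    while read -r name offset size; do
        start=$(($offset))
        if [ "$start" -lt "$prev_end" ]; then
            echo "ERROR: $name overlaps the previous region" >&2
            return 1
        fi
        prev_end=$(($offset + $size))
        printf '%-14s 0x%06X - 0x%06X\n' "$name" "$start" "$prev_end"
    done
    if [ "$prev_end" -gt "$FLASH_SIZE" ]; then
        echo "ERROR: layout runs past the end of the flash" >&2
        return 1
    fi
    echo "OK: $((FLASH_SIZE - prev_end)) bytes unused at the end"
}

layout_report=$(check_layout <<'EOF'
AT91Bootstrap 0x000000 0x0026D8
DeviceTree    0x002800 0x004B00
zImage        0x007300 0x1D8D00
RootFS        0x1E0000 0x237C00
NonVolatile   0x417C00 0x008400
EOF
)
echo "$layout_report"
```

Gaps between regions (such as the one after the bootstrap) are allowed on purpose; only overlaps and overflow are treated as errors.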
First boot
First things first, let's get the board powered up with the Dataflash erased. Upon bootup, the boot loader in ROM of the SAM9N12 will configure basic clocking and the DBGU serial port on pins R5 and R6 to run at 115200 baud, and output the text RomBOOT. If you see this then it means a lot went right, such as power integrity, clocking, BGA packages soldered correctly, and of course no magic smoke means no shorts. Next we will look into setting up SAM-BA so we can flash the AT91 Bootstrap to Dataflash.
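If you would rather script the "did RomBOOT show up" check than stare at a terminal, something along these lines could work. This is only a sketch: the helper name saw_romboot and the /dev/ttyUSB0 device path are assumptions, and it assumes the banner is followed by a line ending. The block itself does a dry run without any hardware attached.

```shell
#!/bin/sh
# saw_romboot (hypothetical helper): succeeds once a line containing the ROM
# bootloader banner shows up on stdin.
saw_romboot() {
    grep -q 'RomBOOT'
}

# Dry run without hardware: feed the expected banner through the helper.
if printf 'RomBOOT\r\n' | saw_romboot; then
    banner_status="ROM bootloader is alive"
else
    banner_status="no banner seen"
fi
echo "$banner_status"

# On real hardware (the /dev/ttyUSB0 path is an assumption, adjust to taste):
#   stty -F /dev/ttyUSB0 115200 cs8 -parenb -cstopb raw
#   saw_romboot < /dev/ttyUSB0 && echo "ROM bootloader is alive"
```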
SAM-BA
The heck is SAM-BA?
So we have our board working enough to get the in ROM bootloader running, great! But how do we actually get some code onto DataFlash, and what about booting into Linux? Most importantly, what about initializing DRAM?
Thankfully, almost all vendors release tools that interact with software in ROM. These tools can use the software in ROM to initialize DRAM, erase flash on DataFlash, NandFlash, NOR Flash, and even modify OTP (One Time Programmable) bits in the IC. Atmel's tool for the AT91 series of chips is called SAM-BA, of which they have two versions. First is an open source, command line only version which, while it looks fantastic (yay for scripting), doesn't support our old SAM9N12, and adding support doesn't seem trivial.
Next is the version we will be using, the original GUI version from here. The documentation for SAM-BA by Atmel is fantastic in terms of how to download it and get it running, so no use for me to say what's been said there other than giving a small summary.
SAM-BA works by you plugging in a USB cable from your desktop to the AT91SAM SOC's USB Device port (the board looks like a device from the perspective of your machine, not the opposite). The SAM-BA tool will communicate with the ROM based bootloader by using a mailbox-esque setup with some shared memory in SRAM of the SOC. The user can write "applets" (just ARM executables) which SAM-BA sends to DRAM and executes on the device. These applets can be used to initialize DRAM or modify NAND Flash. Atmel also helpfully includes an easy way to modify these applets and then recompile them.
DRAM and DataFlash
DRAM is necessary because almost all the applets are designed to run from DRAM instead of SRAM. When starting SAM-BA, the tool automatically initializes DRAM and seems to do a quick small check to ensure DRAM is working, but you can manually call DRAM initialization anyways.
You can also send a file (don't use just 0's or 1's, use random data as below) to DRAM and then have SAM-BA do a comparison to further verify DRAM's functionality with your board design.
# Create four megabytes of random data using dd (bs=1M is far quicker than bs=1).
dd if=/dev/urandom of=randomstuff.bin bs=1M count=4
Next up we have to flash some data to Dataflash (32 Megabits) to ensure that it is working. Why not NAND flash, since we have so much more of it? Even after ensuring the pin setup is correct (NAND Flash is on different pins than DRAM, as per the evaluation board), communication fails. This is likely due to soldering issues of the BGA package.
Using the random data generated earlier, flash that to DataFlash (after running "Enable Dataflash") and then do a compare to ensure Dataflash is working reliably. Don't skip this or you may end up spending nights debugging the wrong thing as your storage medium being faulty leads to all sorts of very hard to track down bugs.
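SAM-BA does the comparison for you, but if you ever end up with a read-back dump on your desktop, cmp will point at the first bad byte. Below is a small self-contained sketch of that idea; the file names and the deliberately corrupted byte at offset 4096 are made up for the demo, and no board is involved.

```shell
#!/bin/sh
workdir=$(mktemp -d)
sent="$workdir/sent.bin"
readback="$workdir/readback.bin"

# Stand-in for the file sent to the board (64 KiB keeps the demo quick).
dd if=/dev/urandom of="$sent" bs=1024 count=64 2>/dev/null

# Stand-in for the data read back from the board, with the byte at offset
# 4096 deliberately changed to imitate a storage fault.
cp "$sent" "$readback"
old=$(dd if="$sent" bs=1 skip=4096 count=1 2>/dev/null | od -An -tu1 | tr -d ' ')
new=$(( (old + 1) % 256 ))
printf '%b' "\\0$(printf '%o' "$new")" |
    dd of="$readback" bs=1 seek=4096 conv=notrunc 2>/dev/null

# cmp -l lists every differing byte as "number sent readback" (1-based byte
# number, octal values); the first line is the first mismatch.
first_diff=$(cmp -l "$sent" "$readback" | awk 'NR == 1 { print $1 }')
if [ -n "$first_diff" ]; then
    echo "MISMATCH: first differing byte is number $first_diff"
else
    echo "OK: files are identical"
fi
rm -rf "$workdir"
```

Note that cmp counts bytes from 1, so a corrupted offset of 4096 is reported as byte number 4097.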
AT91 Bootstrap
In addition to SAM-BA, Atmel released AT91 Bootstrap, a secondary bootloader which handles clocking, DRAM initialization, and can even boot the Linux kernel directly without the need for U-Boot. They also made some good documentation on it, though I found the code itself to be fairly messy and poorly documented. Heck, it's even written in C89 style, where all the variables are declared at the top of functions and loops are written as for(i = 0; i < 100; i++) instead of for(int i = 0; i < 100; i++).
Our first goal is to just get bootstrap both compiling and running on the board, after which comes the Linux kernel and busybox.
Configuring
To download the source code of AT91 Bootstrap and configure it, just do a git clone;
# Clone the git repo of AT91 Bootstrap
git clone https://github.com/linux4sam/at91bootstrap.git
# Enter the directory of the git repo
cd at91bootstrap
This software by Atmel, like almost all other larger codebases that run on bare metal, uses kconfig to configure itself (it works with defines in C/C++ code and various other configuration). In here you can configure things like enabling the ability to direct boot the Linux kernel, where in Dataflash the kernel and device tree are stored, where to copy them to in DRAM, and more.
To start off, we need to get the default configuration (defconfig) that represents our board. Since our board is based on the SAM9N12EK board from Atmel, we have to find the name of that defconfig. In the root of the repository we can do the following:
[hak8or@hak8or at91bootstrap_fiddle]$ find . -name "*sam9n12*defconfig"
./board/at91sam9n12ek/at91sam9n12eksd_linux_image_dt_defconfig
./board/at91sam9n12ek/at91sam9n12eksd_linux_image_defconfig
./board/at91sam9n12ek/at91sam9n12eknf_uboot_defconfig
./board/at91sam9n12ek/at91sam9n12eksd_uboot_defconfig
./board/at91sam9n12ek/at91sam9n12ekdf_linux_image_dt_defconfig
./board/at91sam9n12ek/at91sam9n12eknf_linux_image_defconfig
./board/at91sam9n12ek/at91sam9n12ekdf_linux_image_defconfig
./board/at91sam9n12ek/at91sam9n12eknf_linux_image_dt_defconfig
./board/at91sam9n12ek/at91sam9n12ekdf_uboot_defconfig
The plan is to boot Linux directly (so no uboot) and from Dataflash instead of NAND flash (so df instead of nf). We will also be using a device tree (more on that later), in which case we also want the dt suffix. What's left is at91sam9n12ekdf_linux_image_dt_defconfig, so running make at91sam9n12ekdf_linux_image_dt_defconfig will create a config file in the root of the git repo that is just a copy of that defconfig file, which make menuconfig will then modify. Since we just want the bootstrap to compile and run for now, we only need to make two changes. The first is via menuconfig, in slow clock configuration options, where you should uncheck Use External 32KHZ oscillator because this board does not have that component populated.
The second is the DRAM configuration, which is changed in the board source file rather than in menuconfig. The DRAM used in the evaluation kit is the MT47H64M16HR-3 while we use the W9751G6KB-25. Their IC has eight banks while ours has only four, with the remaining timing parameters being usable as they are, so all we need to change is the following:
[hak8or@hak8or at91bootstrap]$ git diff board/at91sam9n12ek/at91sam9n12ek.c
diff --git a/board/at91sam9n12ek/at91sam9n12ek.c b/board/at91sam9n12ek/at91sam9n12ek.c
index fee32d5..369dab8 100644
--- a/board/at91sam9n12ek/at91sam9n12ek.c
+++ b/board/at91sam9n12ek/at91sam9n12ek.c
@@ -74,7 +74,7 @@ static void ddramc_reg_config(struct ddramc_register *ddramc_config)
ddramc_config->cr = (AT91C_DDRC2_NC_DDR10_SDR9 // 10 column bits (1K)
| AT91C_DDRC2_NR_13 // 13 row bits (8K)
| AT91C_DDRC2_CAS_3 // CAS Latency 3
- | AT91C_DDRC2_NB_BANKS_8 // 8 banks
+ | AT91C_DDRC2_NB_BANKS_4 // 4 banks
| AT91C_DDRC2_DISABLE_RESET_DLL
| AT91C_DDRC2_DECOD_INTERLEAVED);
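To see why the bank count is the only thing that has to change, it helps to work out the capacity that the controller settings imply. The sketch below does that arithmetic; the helper name dram_bytes is made up, and the 16 bit data width is an assumption taken from both parts being x16 devices.

```shell
#!/bin/sh
# dram_bytes COLUMN_BITS ROW_BITS BANKS WIDTH_BITS (hypothetical helper):
# capacity in bytes = 2^columns * 2^rows * banks * (width / 8).
dram_bytes() {
    echo $(( (1 << $1) * (1 << $2) * $3 * $4 / 8 ))
}

# 10 column bits and 13 row bits come straight from the register setup
# above; the 16 bit width is an assumption based on the part numbers.
eval_kit=$(dram_bytes 10 13 8 16)    # MT47H64M16HR-3, 8 banks
our_board=$(dram_bytes 10 13 4 16)   # W9751G6KB-25, 4 banks

echo "8 banks: $eval_kit bytes ($((eval_kit / 1048576)) MiB)"
echo "4 banks: $our_board bytes ($((our_board / 1048576)) MiB)"
```

With four banks the result is 67,108,864 bytes, which matches the 0x0400_0000 figure in the memory map.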
Toolchain
This is where many run into issues: how to handle the toolchain. This guide uses a fairly painless process for handling the toolchains for this project. It is also all done in an Arch Linux based distro for simplicity's sake (Ubuntu's packages tend to be so old that you have to manually add a PPA, while Arch has the AUR, which has a huge amount of packages that are actually up to date since it's a rolling release). There are two types of compilers we will use, arm-none-*** and arm-linux-***, the first being used for AT91 Bootstrap and SAM-BA applets while the second is for the Linux kernel and for cross compiling binaries that run on the board under Linux.
In Arch, simply doing pacman -S arm-none-eabi-gcc will get you the newest (7.3.0 as of writing) toolchain, and under Ubuntu it seems doing apt-get install gcc-arm-none-eabi should also suffice. For arm-linux-*** Buildroot will be used (it fetches the toolchain and compiles everything for you, crazy stuff).
Compiling
Now that we have a compiler, we should be able to just run make CROSS_COMPILE=arm-none-eabi- successfully.
[hak8or@hak8or at91bootstrap_fiddle]$ make CROSS_COMPILE=arm-none-eabi-
CC
========
arm-none-eabi-gcc 7.3.0
as FLAGS
========
-g -Os -Wall -I/home/hak8or/Desktop/armboard/at91bootstrap_fiddle/board/at91sam9n12ek -Iinclude -Icontrib/include -DJUMP_ADDR=0x22000000 -DTOP_OF_MEMORY=0x308000 -DMACH_TYPE=9999 -Dat91sam9n12ek -DMACH_TYPE=9999 -DTOP_OF_MEMORY=0x308000 -DCRYSTAL_16_000MHZ -DAT91SAM9N12 -mcpu=arm926ej-s -mtune=arm926ej-s -mfloat-abi=soft -DCONFIG_THUMB -mthumb-interwork -DCONFIG_AT91SAM9N12EK
gcc FLAGS
=========
-nostdinc -isystem /usr/lib/gcc/arm-none-eabi/7.3.0/include -ffunction-sections -g -Os -Wall -mno-unaligned-access -fno-stack-protector -fno-common -fno-builtin -I/home/hak8or/Desktop/armboard/at91bootstrap_fiddle/board/at91sam9n12ek -Icontrib/include -Iinclude -Ifs/include -I/home/hak8or/Desktop/armboard/at91bootstrap_fiddle/config/at91bootstrap-config -DAT91BOOTSTRAP_VERSION="3.8.10-rc1-dirty" -DCOMPILE_TIME="Tue Mar 27 02:26:14 EDT 2018" -DIMG_ADDRESS=0x00040000 -DIMG_SIZE= -DJUMP_ADDR=0x22000000 -DOF_OFFSET=0x00008400 -DOF_ADDRESS=0x21000000 -DMEM_BANK=0x20000000 -DMEM_SIZE=0x8000000 -DIMAGE_NAME="" -DCMDLINE="" -DCMDLINE_FILE="" -DTOP_OF_MEMORY=0x308000 -DMACH_TYPE=9999 -DCONFIG_DEBUG -DBANNER="\n\nAT91Bootstrap " AT91BOOTSTRAP_VERSION " (" COMPILE_TIME ")\n\n" -DCONFIG_HW_DISPLAY_BANNER -DCONFIG_HW_INIT -Dat91sam9n12ek -DMACH_TYPE=9999 -DTOP_OF_MEMORY=0x308000 -DCRYSTAL_16_000MHZ -DAT91SAM9N12 -mcpu=arm926ej-s -mtune=arm926ej-s -mfloat-abi=soft -DCONFIG_THUMB -mthumb -mthumb-interwork -DCONFIG_SCLK -DCONFIG_CRYSTAL_16_000MHZ -DCONFIG_CPU_CLK_400MHZ -DCONFIG_BUS_SPEED_133MHZ -DCPU_HAS_PIO3 -DCONFIG_AT91SAM9N12EK -DCONFIG_DDRC -DCONFIG_DDR2 -DCONFIG_RAM_64MB -DCONFIG_DATAFLASH -DCONFIG_LOAD_LINUX -DCONFIG_LINUX_IMAGE -DCONFIG_OF_LIBFDT -DCONFIG_DATAFLASH_RECOVERY -DCONFIG_SMALL_DATAFLASH -DAT91C_SPI_CLK=33000000 -DAT91C_SPI_PCS_DATAFLASH=AT91C_SPI_PCS0_DATAFLASH -DBOOTSTRAP_DEBUG_LEVEL=DEBUG_INFO -DCONFIG_DISABLE_WATCHDOG -DCPU_HAS_HSMCI0 -DCONFIG_SPI_BUS0 -DCONFIG_SPI
ld FLAGS
========
-nostartfiles -Map=/home/hak8or/Desktop/armboard/at91bootstrap_fiddle/binaries/at91sam9n12ek-dataflashboot-linux--dt-3.8.10-rc1.map --cref -static -T elf32-littlearm.lds --gc-sections -Ttext 0x300000
AS /home/hak8or/Desktop/armboard/at91bootstrap_fiddle/crt0_gnu.S
CC /home/hak8or/Desktop/armboard/at91bootstrap_fiddle/main.c
CC /home/hak8or/Desktop/armboard/at91bootstrap_fiddle/board/at91sam9n12ek/at91sam9n12ek.c
CC /home/hak8or/Desktop/armboard/at91bootstrap_fiddle/lib/string.c
CC /home/hak8or/Desktop/armboard/at91bootstrap_fiddle/lib/eabi_utils.c
CC /home/hak8or/Desktop/armboard/at91bootstrap_fiddle/lib/div.c
CC /home/hak8or/Desktop/armboard/at91bootstrap_fiddle/lib/fdt.c
CC /home/hak8or/Desktop/armboard/at91bootstrap_fiddle/driver/debug.c
CC /home/hak8or/Desktop/armboard/at91bootstrap_fiddle/driver/at91_slowclk.c
CC /home/hak8or/Desktop/armboard/at91bootstrap_fiddle/driver/common.c
CC /home/hak8or/Desktop/armboard/at91bootstrap_fiddle/driver/at91_pio.c
CC /home/hak8or/Desktop/armboard/at91bootstrap_fiddle/driver/pmc.c
CC /home/hak8or/Desktop/armboard/at91bootstrap_fiddle/driver/at91_pit.c
CC /home/hak8or/Desktop/armboard/at91bootstrap_fiddle/driver/at91_wdt.c
CC /home/hak8or/Desktop/armboard/at91bootstrap_fiddle/driver/at91_usart.c
CC /home/hak8or/Desktop/armboard/at91bootstrap_fiddle/driver/at91_rstc.c
CC /home/hak8or/Desktop/armboard/at91bootstrap_fiddle/driver/ddramc.c
CC /home/hak8or/Desktop/armboard/at91bootstrap_fiddle/driver/at91_spi.c
CC /home/hak8or/Desktop/armboard/at91bootstrap_fiddle/driver/spi_flash.c
CC /home/hak8or/Desktop/armboard/at91bootstrap_fiddle/driver/dataflash.c
CC /home/hak8or/Desktop/armboard/at91bootstrap_fiddle/driver/load_kernel.c
LD at91sam9n12ek-dataflashboot-linux--dt-3.8.10-rc1.elf
Size of at91sam9n12ek-dataflashboot-linux--dt-3.8.10-rc1.bin is 8336 bytes
[Succeeded] It's OK to fit into SRAM area
[Attention] The space left for stack is 14664 bytes
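The build already checks that the binary fits in SRAM, but not that it fits in the flash region we reserved for it (0x26D8 bytes in the memory map). A quick check along these lines can catch that early. This is a sketch: fits_in_slot is a made-up helper, and a scratch file of 8336 bytes stands in for the real binary so the block runs anywhere.

```shell
#!/bin/sh
# fits_in_slot FILE MAX_BYTES (hypothetical helper): succeeds when FILE is
# no larger than the flash region reserved for it.
fits_in_slot() {
    file_size=$(( $(wc -c < "$1") ))
    [ "$file_size" -le "$(($2))" ]
}

# Stand-in for the real binary: a scratch file of the size reported above.
scratch=$(mktemp)
dd if=/dev/zero of="$scratch" bs=1 count=8336 2>/dev/null

if fits_in_slot "$scratch" 0x26D8; then
    slot_status="fits"
else
    slot_status="too big"
fi
echo "bootstrap: $slot_status ($((0x26D8 - 8336)) bytes to spare)"
rm -f "$scratch"
```

On a real build you would point the helper at the generated .bin in the binaries directory instead of the scratch file.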
Booting
We need to copy the resulting boot.bin (which is a symlink to at91sam9n12ek-dataflashboot-linux--dt-3.8.10-rc1.bin) to Dataflash. Do not use just "copy file" because the ROM bootloader needs some extra information. Using the "Save Boot File" applet in SAM-BA will generate and save this extra information.
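What that extra information consists of is not spelled out here. Going by the valid code detection description in Atmel's SAM9 datasheets, the ROM code expects the sixth ARM exception vector (offset 0x14) to hold the size of the image to load. The sketch below assumes exactly that, uses made-up helper names, and only demonstrates the idea on a scratch file; for the real board, stick with the SAM-BA applet.

```shell
#!/bin/sh
# ASSUMPTION: the ROM code reads the image size as a 32-bit little-endian
# word from offset 0x14 (the sixth ARM exception vector).

# read_size_vector FILE (hypothetical helper): print the word at offset 0x14.
read_size_vector() {
    set -- $(od -An -tu1 -j20 -N4 "$1")
    echo $(( $1 | ($2 << 8) | ($3 << 16) | ($4 << 24) ))
}

# write_size_vector FILE (hypothetical helper): store FILE's own size at 0x14.
write_size_vector() {
    size=$(( $(wc -c < "$1") ))
    for bits in 0 8 16 24; do
        printf '%b' "\\0$(printf '%o' $(( (size >> bits) & 255 )))"
    done | dd of="$1" bs=1 seek=20 conv=notrunc 2>/dev/null
}

# Demo on a scratch file the size of our bootstrap build, not the real one.
scratch=$(mktemp)
dd if=/dev/zero of="$scratch" bs=1 count=8336 2>/dev/null
write_size_vector "$scratch"
stored_size=$(read_size_vector "$scratch")
echo "size stored at 0x14: $stored_size"
rm -f "$scratch"
```

The read helper alone is non-destructive, so it can also be used to peek at whether a given file has already been prepared this way.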
When we restart the board, we should see the following on the serial port:
RomBOOT
AT91Bootstrap 3.8.10-rc1-dirty (Tue Mar 27 02:26:14 EDT 2018)
SF: Got Manufacturer and Device ID: 0x1f 0x27 0x1 0x1 0x0
SF: Press the recovery button (PB4) to recovery
SF: Failed to load image
Looks like we have AT91 Bootstrap compiling, running, and interpreting the Dataflash IC correctly! Next up is working with device tree.
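The ID bytes in that log can be decoded by hand to confirm the bootstrap is really talking to the part we soldered down. The following is a hedged sketch with a made-up helper name. It assumes 0x1f is the Atmel manufacturer code and that the second byte packs a 3 bit family code and a 5 bit density code, with the density in Mbit being 2^(code - 2); that assumption at least gives 32 Mbit for our 0x27.

```shell
#!/bin/sh
# decode_dataflash_id MANUFACTURER DEVICE_BYTE1 (hypothetical helper).
# ASSUMPTION: device byte 1 is laid out as fffddddd (family, density code)
# and density in Mbit = 2^(density code - 2).
decode_dataflash_id() {
    family=$(( ($2 >> 5) & 0x7 ))
    density_code=$(( $2 & 0x1F ))
    mbit=$(( 1 << (density_code - 2) ))
    printf 'manufacturer=0x%02X family=%d density=%dMbit\n' \
        "$(($1))" "$family" "$mbit"
}

# The first two ID bytes from the boot log above.
id_report=$(decode_dataflash_id 0x1f 0x27)
echo "$id_report"
```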
Device Tree
Before Device Tree
Years ago, when ARM for embedded Linux was still in its infancy, all the vendors were putting their different peripherals (HDMI driver, SPI driver, where DRAM gets mapped to, clock configuration, etc) in different areas of the address space. The quickest and dirtiest solution to this was simply hardcoding these differences in the kernel and in-tree drivers for each SOC and design, which resulted in kernel images that were not portable across designs. Other information would also be passed to the kernel via registers during boot. This is why there would often be a different image to flash to an SD card for each board, and using the wrong image resulted in a barely, if at all, functional system.
Why does this seem not to be an issue for x86? Years ago when IBM was still making desktops, IBM wrote the BIOS (Basic Input Output System) which attempted to abstract away the underlying hardware and provide some functions for the OS, like setting the cursor position and writing text to the screen. Going further, IBM also made a standard (like what gets mapped to where in the address space) which resulted in the whole "IBM Compatible" PC world. It would initialize all the hardware and do crazy things like attempt to upload a program from the keyboard port into RAM. Most importantly though, the BIOS also reported to the OS what hardware was present on the system. Years later we got ACPI which did a better job of reporting and interacting with underlying hardware, and now we have the next incarnation of the BIOS called UEFI.
Going back to ARM and today, the lack of such hardware probing and reporting was a huge problem and kept getting worse as time went on, because maintaining these separate kernels was extremely time consuming. Vendors tended to release a modified kernel for the design and that's it, no updating to a newer kernel unless you had a huge amount of time on your hands to do it yourself. Not to mention, hardware vendors generally seem to just throw hardware out in the wind with a "hard work is done, now you software people do everything" attitude, so this isn't too surprising. Needless to say, this was a total nightmare on ARM.
As the Linux kernel maintainers saw this horror show unfurl in front of them gradually but steadily, with no signs of ARM or vendors stepping up to fix the problem, Linus complained.
On Mon, Apr 18, 2011 at 8:17 AM, Alexey Zaytsev wrote:
Could you please just apologize for the pointless diffstat complain, so we could go on?
Ehh. They aren't pointless, and I'm this close to just stopping pulling from some people unless things improve.
Dear Russel.
Please don't take the offense. Linus might be a dickhead at times, and sometimes he's wrong, but I'm sure he did not mean to hurt you.
Umm. The "some people" who need to get their shit together was never Russell (and we've been emailing in private about it). We may not agree about every detail, but on the whole we're not at all butting heads.
Why do you think he posted that email with those arm statistics?
It's the machine/platform guys who are trouble.
Hint for anybody on the arm list: look at the dirstat that rmk posted, and if your "arch/arm/{mach,plat}-xyzzy" shows up a lot, it's quite possible that I won't be pulling your tree unless the reason it shows up a lot is because it has a lot of code removed.
People need to realize that the endless amounts of new pointless platform code is a problem, and since my only recourse is to say "if you don't seem to try to make an effort to fix it, I won't pull from you", that is what I'll eventually be doing.
Exactly when I reach that point, I don't know.
Linus
Post Device Tree
With Linus finally (and rightfully so) putting a halt on such nonsense from the vendors, the community got to work. Inspired by how Open Firmware, as used by Sun on SPARC (before they got bought by Oracle), handled giving the kernel information about what hardware is present on the system, Device Tree was born. Check out Open Firmware by the way, it's amazing, it even has a FORTH interpreter that can do TCP/IP and more in under 350 KB!
Device tree relies on you supplying a tree of nodes, with each node specifying parameters that the associated driver for the node may use for configuration. The vendor gives a .dtsi file which describes all the peripherals the SOC has, such as where the SPI register interface is located in the address space and what type of driver to use for the SPI peripheral. For mainline Linux, this file is provided in the arch/arm/boot/dts/ folder, in our case being at91sam9n12.dtsi. As an example, here is what the SPI node looks like. Notice it includes the address of the register block associated with the SPI peripheral, what it's compatible with (which driver to use), the clock source, and most importantly its status being set to disabled.
spi0: spi@f0000000 {
    #address-cells = <1>;
    #size-cells = <0>;
    compatible = "atmel,at91rm9200-spi";
    reg = <0xf0000000 0x100>;
    interrupts = <13 IRQ_TYPE_LEVEL_HIGH 3>;
    dmas = <&dma 1 AT91_DMA_CFG_PER_ID(1)>,
           <&dma 1 AT91_DMA_CFG_PER_ID(2)>;
    dma-names = "tx", "rx";
    pinctrl-names = "default";
    pinctrl-0 = <&pinctrl_spi0>;
    clocks = <&spi0_clk>;
    clock-names = "spi_clk";
    status = "disabled";
};
When the kernel boots and parses the Device Tree, it looks at the "compatible" field of each node and searches for drivers which have both been compiled into the kernel and advertise themselves as matching that "compatible" string. In this case, the kernel will look for a compiled-in driver that advertises "atmel,at91rm9200-spi", which reads as vendor "atmel" and device "at91rm9200-spi". But, since the status is disabled, the driver won't actually be loaded for this node.
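The matching rule described above can be boiled down to a few lines. What follows is a toy model in shell, not kernel code: the helper name would_bind and the one-entry driver table are invented for illustration, and it ignores details such as a missing status property.

```shell
#!/bin/sh
# Toy model only, not kernel code: a driver is bound to a node when the
# node's status is "okay" and one of the node's compatible strings appears
# in the driver's match table.
would_bind() {   # would_bind STATUS "NODE_COMPATIBLES" "DRIVER_TABLE"
    [ "$1" = "okay" ] || return 1
    for compat in $2; do
        case " $3 " in
            *" $compat "*) return 0 ;;
        esac
    done
    return 1
}

# Hypothetical match table for the SPI driver, holding the one string
# that appears in the node above.
spi_driver_table="atmel,at91rm9200-spi"

if would_bind disabled "atmel,at91rm9200-spi" "$spi_driver_table"; then
    dtsi_only="bound"
else
    dtsi_only="skipped"
fi
if would_bind okay "atmel,at91rm9200-spi" "$spi_driver_table"; then
    with_board_dts="bound"
else
    with_board_dts="skipped"
fi
echo "status=disabled: $dtsi_only, status=okay: $with_board_dts"
```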
This is where a .dts file comes in. Our board is based on the SAM9N12EK from Atmel, so we should be able to base our .dts file on the already provided at91sam9n12ek.dts. Looking at the SPI node here, we get this:
spi0: spi@f0000000 {
    status = "okay";
    cs-gpios = <&pioA 14 0>, <0>, <0>, <0>;
    m25p80@0 {
        compatible = "atmel,at25df321a";
        spi-max-frequency = <50000000>;
        reg = <0>;
    };
};
A .dts file is meant to say what peripherals are present and used on a board. Notice at the top of the file it says #include "at91sam9n12.dtsi", which pulls in all the nodes, including the status = "disabled"; mentions. Conflicting definitions (status = "disabled"; versus status = "okay"; for example) are resolved in favor of the most recent mention (the .dts file). In effect, the "top" file overlays itself over older files and overwrites conflicting node parameters. Therefore, when a peripheral is marked with status = "okay"; in the .dts file, this takes priority over any older mentions, in effect telling the kernel to enable the driver for the peripheral. The driver, in this case the Atmel SPI driver, then comes in and configures the peripheral using the other information in the node.
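The "last mention wins" behaviour can be mimicked with a few lines of awk. This is a toy model only, with a made-up helper name and properties flattened to name=value lines; the real merging is done by the device tree compiler on whole nodes.

```shell
#!/bin/sh
# Toy model only: properties are "name=value" lines, and a later line with
# the same name replaces the earlier value, like a .dts overriding a .dtsi.
merge_props() {
    awk -F= '
        !($1 in value) { order[++count] = $1 }
        { value[$1] = $2 }
        END { for (i = 1; i <= count; i++) print order[i] "=" value[order[i]] }
    '
}

# The first two lines stand in for at91sam9n12.dtsi, the third for the
# board's .dts file that is processed afterwards.
merged=$(merge_props <<'EOF'
compatible=atmel,at91rm9200-spi
status=disabled
status=okay
EOF
)
echo "$merged"
```

Properties that only the earlier file mentions (compatible here) survive untouched, which is why the board file can stay so short.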
The driver advertises itself as compatible here and pulls in which pins to use as chip select from the Device Tree here.
Lastly, the device tree is not stored as just a big text file on the device. It's compiled using dtc into a single .dtb (device tree binary) file. When the kernel is booting, the address of this file is provided in one of the CPU registers. As time went on, the community has done an amazing job porting much of the old code into a device tree compliant format, as shown here in another probably better writeup. Also, the kernel as usual has some solid documentation on device tree usage. There is also an official standard. The key takeaway from this is that a device tree is just a way of describing to the kernel what hardware is present, where in the address map it is, what driver to use for interacting with it, and various optional parameters.
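A compiled .dtb starts with a small big-endian header whose first two 32-bit words are the magic number 0xd00dfeed and the total size of the blob. Reading those two words is a cheap way to confirm a file really is a device tree and that it will fit in the 0x4B00 byte region reserved for it in Dataflash. The sketch below uses a made-up helper name and a hand-built eight byte stand-in instead of a real .dtb, so it runs without dtc installed.

```shell
#!/bin/sh
# dtb_header FILE (hypothetical helper): print the magic and totalsize
# fields, the first two big-endian 32-bit words of a flattened device tree.
dtb_header() {
    set -- $(od -An -tu1 -N8 "$1")
    magic=$(printf '%02x%02x%02x%02x' "$1" "$2" "$3" "$4")
    total=$(( ($5 << 24) | ($6 << 16) | ($7 << 8) | $8 ))
    echo "$magic $total"
}

# Stand-in for a real .dtb: just the magic 0xd00dfeed and a totalsize of
# 0x1234 bytes. A real file would come out of dtc or the kernel build.
sample=$(mktemp)
printf '\320\015\376\355\000\000\022\064' > "$sample"

header=$(dtb_header "$sample")
dtb_magic=${header% *}
dtb_size=${header#* }
rm -f "$sample"

if [ "$dtb_magic" = "d00dfeed" ] && [ "$dtb_size" -le $((0x4B00)) ]; then
    echo "valid header, $dtb_size bytes fits the 0x4B00 byte Device Tree region"
else
    echo "bad magic or too large for the Device Tree region"
fi
```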
Custom Device Tree
Time to make our own device tree. We know we want the following;
- Serial Port for interacting with the device
- SPI port for Dataflash (root file system is not copied to RAM)
- Dataflash IC itself on SPI0 peripheral
- Partition the Dataflash for all our data
- USB Host (it's only 4 lines, will use this for a Wifi dongle)
At the top level we have a memory node to tell the kernel where memory is and how much of it the kernel is allowed to use, what clock sources there are, and then peripherals mapped onto various buses. In ARM there are a few buses as per spec; in our case the USB peripheral is directly on the AHB bus. From the AHB bus branches off a slower APB bus, to which the SPI peripheral is attached. The SPI bus has only one device, an AT45 based Dataflash IC, whose flash memory is mapped using MTD across 5 partitions. Each partition node has a "reg" field which has two arguments, the offset and how large the partition is. Note that if you want to be able to write to a partition, it must be aligned to a page boundary (528 bytes per page by default for the AT45DB321E).
/*
 * at91sam9n12ek.dts - Device Tree file for AT91SAM9N12-EK board
 *
 * Copyright (C) 2012 Atmel,
 *               2012 Hong Xu <hong.xu@atmel.com>
 *
 * Licensed under GPLv2 or later.
 */
/dts-v1/;
#include "at91sam9n12.dtsi"

/ {
    model = "Atmel AT91SAM9N12-EK";
    compatible = "atmel,at91sam9n12ek", "atmel,at91sam9n12", "atmel,at91sam9";

    memory {
        reg = <0x20000000 0x4000000>;
    };

    clocks {
        main_xtal {
            clock-frequency = <16000000>;
        };
    };

    ahb {
        apb {
            dbgu: serial@fffff200 {
                status = "okay";
            };

            spi0: spi@f0000000 {
                status = "okay";
                cs-gpios = <&pioA 14 0>, <0>, <0>, <0>;

                flash@0 {
                    status = "okay";
                    compatible = "atmel,at45";
                    spi-max-frequency = <25000000>;
                    reg = <0>;

                    partitions {
                        compatible = "fixed-partitions";
                        #address-cells = <1>;
                        #size-cells = <1>;

                        partition@0 {
                            label = "AT91Bootstrap";
                            reg = <0x00 0x26D8>;
                            read-only;
                        };

                        partition@2800 {
                            label = "DeviceTree";
                            reg = <0x2800 0x4B00>;
                            read-only;
                        };

                        partition@7300 {
                            label = "zImage";
                            reg = <0x7300 0x1D8D00>;
                            read-only;
                        };

                        partition@1E0000 {
                            label = "RootFS";
                            reg = <0x1E0000 0x237C00>;
                            read-only;
                        };

                        partition@417C00 {
                            label = "NonVolatile";
                            reg = <0x417C00 0x8400>;
                        };
                    };
                };
            };
        };

        usb0: ohci@500000 {
            num-ports = <1>;
            status = "okay";
        };
    };
};
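Since only page aligned partitions can be written to, it is worth checking which of the partitions above qualify. The following sketch does so with the 528 byte page size; the helper name page_aligned is made up, and checking the partition size in addition to its offset is an extra assumption beyond what the text above states.

```shell
#!/bin/sh
PAGE_SIZE=528   # bytes per page of the AT45DB321E in its default mode

# page_aligned OFFSET SIZE (hypothetical helper): succeeds when a partition
# both starts on a page boundary and spans a whole number of pages.
page_aligned() {
    [ $(( $1 % PAGE_SIZE )) -eq 0 ] && [ $(( $2 % PAGE_SIZE )) -eq 0 ]
}

# Offsets and sizes copied from the reg properties of the partition nodes.
alignment_report=$(
    while read -r label offset size; do
        if page_aligned "$offset" "$size"; then
            echo "$label: page aligned"
        else
            echo "$label: not page aligned, keep it read-only"
        fi
    done <<'EOF'
AT91Bootstrap 0x0      0x26D8
DeviceTree    0x2800   0x4B00
zImage        0x7300   0x1D8D00
RootFS        0x1E0000 0x237C00
NonVolatile   0x417C00 0x8400
EOF
)
echo "$alignment_report"
```

Only the NonVolatile partition comes out aligned, which lines up with it being the only one without the read-only flag.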
Chosen Node
Sometimes you might spot a node in the top level called "chosen" which includes the bootargs used to tell the kernel where the rootfs is and more. For example, the at91sam9n12ek.dts file in the kernel says, via the "chosen" node, that there is a root file system in the second partition of flash which is both readable and writable and uses the JFFS2 file system. Also, all output going to stdout will be put on the first serial port at a baud rate of 115200 with no parity bits and 8 bits per character.
chosen {
    bootargs = "root=/dev/mtdblock1 rw rootfstype=jffs2";
    stdout-path = "serial0:115200n8";
};
The kernel by default attempts to use the boot arguments from the bootloader, so this would be discarded, hence not including it here. Furthermore, AT91Bootstrap seems to have a bug where if you disable the "Override the config kernel command-line" option and enable copying a Device Tree to DRAM, it fails when checking bootargs by claiming it's a null terminated empty string.
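To make the two strings from the "chosen" node shown earlier less cryptic, here is a small sketch that pulls them apart with plain shell parameter expansion. The variable names are made up, and the "<alias>:<baud><parity><bits>" reading of stdout-path is based on the description above rather than on a full parser.

```shell
#!/bin/sh
bootargs='root=/dev/mtdblock1 rw rootfstype=jffs2'
stdout_path='serial0:115200n8'

# Split the kernel command line on spaces; "key=value" words carry a value,
# bare words such as "rw" are flags.
parsed_args=$(
    for word in $bootargs; do
        case $word in
            *=*) echo "${word%%=*} = ${word#*=}" ;;
            *)   echo "$word (flag)" ;;
        esac
    done
)
echo "$parsed_args"

# stdout-path is read here as "<alias>:<baud><parity><bits>".
console_alias=${stdout_path%%:*}
options=${stdout_path#*:}
baud=${options%%[!0-9]*}
parity_and_bits=${options#"$baud"}
parity=${parity_and_bits%?}
bits=${parity_and_bits#?}
echo "console on $console_alias at $baud baud, parity $parity, $bits data bits"
```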
Passing to the Kernel
When the kernel boots, it expects the address of where the device tree begins in one of the CPU registers, which means the device tree must be reachable through the address space. Since the AT91SAM9N12 does not expose/map the contents of Dataflash from the SPI peripheral in the address space (not to mention that the kernel will attempt to configure the SPI peripheral by starting up its driver), the device tree must be copied to DRAM. The only way to do this is to inform the bootloader (AT91 Bootstrap) that we want to copy a device tree from Dataflash to DRAM. This will be shown next, after we get Linux compiling.
Buildroot
The heck is Buildroot?
The Old School guide shows how to set the kernel and root file system up by hand which, while a great learning experience, is extremely tedious and error prone. A few years ago amongst an enormous mass of custom shell and python scripts in an attempt to automate this process, the community decided to make BuildRoot. This automates all of the following for you;
- Downloading and compiling a toolchain
- Downloading and cross compiling the kernel, Busybox, packages for rootFS
- Dependency management for packages
- Wrappers for menuconfig of Busybox, Linux, and more
- Optionally can handle AT91 Bootstrap and U-Boot
- Creates a root file system (and even compresses it)
- Wrapped up in makefiles which support nconfig and xconfig
In short, Busybox is an amazing tool that takes all the pain out of setting up a Linux based system. What about Yocto? I didn't get a chance to actually try it out yet, so Buildroot it is!
Going back to our original goal, we want a minimal system that can fit in under 4 Megabytes. This means no xorg (no graphical interface), no package manager, no Python or Ruby, only the basic necessities. Conceptually we need three different items: the Linux Kernel, the Root File System (ROOTFS, contains Busybox and our applications), and the Device Tree. Since the root file system will remain on the SPI based Dataflash, the kernel needs the driver for the SPI peripheral on our SOC.
To get Buildroot, download a release archive from the Buildroot site and extract it.
# Download the package
wget https://buildroot.org/downloads/buildroot-2018.02.1.tar.gz
# Extract the File (tar -xf, lack of z flag autodetects file type)
tar -xf buildroot-2018.02.1.tar.gz
Defconfig
Buildroot also makes use of defconfigs (default configurations), which you can list with a simple find command.
[hak8or@hak8or buildroot-2018.02.1]$ find . -name "*defconfig" | grep "at91"
./output/build/linux-headers-4.13/arch/arm/configs/at91_dt_defconfig
./output/build/linux-4.13/arch/arm/configs/at91_dt_defconfig
./configs/at91sam9x5ek_mmc_dev_defconfig
./configs/at91sam9260eknf_defconfig
...
Sadly there is no defconfig for our specific board. Usually it's sufficient to find the closest defconfig, specifically one using the same CPU core (the AT91SAM9N12 uses an ARM926EJ-S), and adapt it, but let's try to do this ourselves to see what a defconfig could include. Afterwards, the current configuration of Buildroot can be saved as a defconfig, which will allow you to use a simple make defconfig BR2_DEFCONFIG=/home/hak8or/BrainyV2_buildroot_minimal_defconfig to automatically select everything mentioned below.
Our Tiny Defconfig
As said earlier, this SOC uses an ARM926EJ-S based core running at 400 MHz with 64 MB of DDR2 SDRAM. When running make nconfig at the root of the Buildroot directory, go to Target Options ---> Target Architecture and select ARM Little Endian. To set the actual CPU core, go to Target Options ---> Target Architecture Variant and select arm926t (yes, we actually have the "EJ-S" variant but this works fine).
We want the Linux kernel of course, so in the Kernel ---> tab enable the kernel, which should populate the kernel version field with 4.15 (as of Q1 2018). Instead, change the Kernel Version field to Custom version and type in 4.15.18. This is because later we will apply a few patches to the kernel. The Linux kernel itself does have a default config for our SOC, so select in-tree defconfig file and type at91_dt in the "Defconfig Name" field. This configuration has the kernel include various drivers for Atmel's AT91SAM series of ICs and assumes a device tree will be appended to the kernel image.
For the Toolchain ---> tab, select Enable C++ support because we will need that later for many applications. I would also suggest selecting the latest binutils and GCC.
Next up is the device tree we made earlier. Create a file called BrainyV2.dts in the Linux kernel source folder (in my case being output/build/linux-4.15.7/arch/arm/boot/dts/) containing that device tree. In the Kernel tab, enable Build a Device Tree Blob, select Use a device tree present in the kernel and for the file name put BrainyV2 (yes, there is no file extension included even though it says file name). This will have the kernel build process create a file in the output/images folder for the compiled device tree with the .dtb file extension instead of us having to do this manually.
We will be making use of a Wifi dongle which requires a 3rd party, closed source, firmware binary blob. The Linux kernel keeps these out of its tree and instead has a separate tree (linux-firmware) just for them. Devices like Wifi dongles, GPUs, and other "complicated" hardware often have firmware which must be loaded at boot, and sadly most of the time it's closed source. Inside Buildroot we can have it download the binary blob and put it into a standard location (/lib/firmware) by going to Target Packages->Hardware Handling->Firmware->Linux-firmware->Wifi firmware->Atheros 9271.
Lastly, we only have 4 MB for everything so compression is extremely important. We do not need the root file system to be writable, so we can use SquashFS. This file system was designed specifically for space constrained systems and has tons of awesome features, with one large limitation: it's read only. Also, the xz compression format seems to do the best on my system in terms of compression ratio. To have the root file system use squashfs with xz, go to Filesystem images ---> and select squashfs root filesystem with Compression algorithm (xz). For the kernel, go to Kernel ---> and ensure the Kernel binary format is a zImage with the compression set to xz compression.
To summarize, here is what you need to change:
- Buildroot nconfig
  - Target Options
    - Target Architecture = ARM Little Endian
    - Target Architecture Variant = arm926t
  - Kernel
    - Enabled
    - Kernel Version = Custom Version (4.15.18)
    - Kernel Configuration = in-tree defconfig file
    - Defconfig name = at91_dt
    - Kernel compression format = xz compression
    - Build a Device Tree Blob = Checked
    - Use a device tree present in the kernel
    - Device Tree Source file names = BrainyV2
  - Filesystem images
    - tar the root filesystem = unchecked
    - squashfs = checked
    - Compression algorithm = xz
  - Toolchain
    - Binutils Version = 2.3
    - GCC compiler Version = 7.x
    - Enable C++ support = checked
  - Target Packages
    - Hardware Handling
      - Firmware
        - Linux-firmware
          - WiFi firmware
            - Atheros 9271 = checked
Run make and wait a while. This will download and compile GCC (our cross compiler), the Linux kernel, Busybox, and tools for the host, then create a root file system, and lastly create images for both the kernel and the root file system. This will take a pretty long time (half an hour on an i5-3570K OC'd to 4.6 GHz running from an SSD and 4.5 GB RAM), so go grab an espresso in the meantime.
Size Restrictions
At this point you should have a kernel and root file system that together are only 3.9 Megabytes large.
[hak8or@hak8or buildroot-2018.02]$ ls -lh output/images/
total 3.9M
-rw-r--r-- 1 hak8or users 18K Apr 2 17:34 at91sam9n12ek_custom.dtb
-rw-r--r-- 1 hak8or users 1.2M Apr 2 17:34 rootfs.squashfs
-rw-r--r-- 1 hak8or users 2.7M Apr 2 17:34 zImage
This has everything we need to boot, but remember that we only have 4 MB to play with. Given our earlier size constraints for the root file system (2.217 MB) and kernel (1.846 MB), the kernel (2.7M) won't fit. Next up is looking into what we can remove in order to bring the size down.
zImage minifying
Measurements
As shown earlier, we have a 2.7 MB zImage (kernel image) when our size limit is 1.846 MB, and our root file system is 1.2 MB which is under the max of 2.217 MB. There are two points of confusion here:
- Why is the root file system so large? It should only have busybox with libc and libc++, which when combined shouldn't be over a megabyte.
- What can we remove from the zImage to get the kernel size down while still having a decently functional system?
Let's look at the root file system first since Buildroot provides make graph-size. This will let us see what is occupying so much space in our rootfs via nice plots and csv files in the output/graphs folder. Keep in mind that "Total filesystem size" is the uncompressed size. After pumping the root filesystem through squashfs with xz compression it drops from 3.5 MB down to 1.2 MB.
Hm, that is unusual, what the heck is "Linux"? The kernel isn't in the root file system (it's put in a totally separate area in data flash). Unknown is also worth looking into. Thankfully Buildroot also generates a file called file-size-stats.csv
. The contents help give us more information, of which a trimmed version is below.
[hak8or@hak8or graphs]$ cat file-size-stats.csv | awk '{gsub("lib/modules/4.15.7/kernel/drivers/", "..."); print}' | column -s, -t | grep -v " 4096 " | sort -n -k 3 | tail -n 10
File name Package name File size Package size File size in package (%) File size in system (%)
...net/wireless/realtek/rtlwifi/rtl8192c/rtl8192c-common.ko linux 48268 1130774 4.3 1.3
...net/wireless/ralink/rt2x00/rt2x00lib.ko linux 48736 1130774 4.3 1.3
...net/wireless/ralink/rt2x00/rt2800usb.ko linux 49804 1130774 4.4 1.4
...net/wireless/marvell/libertas/libertas.ko linux 65316 1130774 5.8 1.8
...net/wireless/realtek/rtlwifi/rtlwifi.ko linux 75804 1130774 6.7 2.1
...net/wireless/realtek/rtlwifi/rtl8192cu/rtl8192cu.ko linux 82528 1130774 7.3 2.3
...net/wireless/ralink/rt2x00/rt2800lib.ko linux 96488 1130774 8.5 2.7
...net/wireless/marvell/mwifiex/mwifiex.ko linux 275384 1130774 24.4 7.6
lib/libuClibc-1.0.28.so uclibc 489384 562728 87.0 13.5
bin/busybox busybox 719316 722918 99.5 19.9
This tells us that a decent portion of the files seem to be device drivers for various USB based wireless dongles, and they all come from the Linux package.
Kernel
The Linux kernel is a monolithic kernel, meaning the entire Operating System is running in kernel space, including various device drivers. Therefore, device drivers tend to be included in the kernel source code, in our case the drivers for USB wireless dongles, hence these drivers being marked as coming from the "Linux" package. Here is a great guide on how to measure and tune the size of the kernel, some of which we will be using here. When compiling the Linux kernel you can specify if you want various components to be compiled into the image (zImage in our case) or as "Modules" which get loaded at run time from the root file system. For example, in the below image of the kernel configuration, the "Marvell WiFi-Ex" drivers are compiled as modules while the "Realtek rtlwifi" drivers are compiled into the kernel image. This is why you see the Marvell drivers in the above snippet, since they are in the root file system instead of the kernel image.
Buildroot lets you access various packages (one of which is Linux) through the make tool in the format of make *packagename*-make/menuconfig/nconfig/clean/rebuild. For example, to configure the Linux kernel using nconfig, use make linux-nconfig. Looking around in there, we can see there are many options which are enabled, such as networking and these device drivers.
As a reminder, our goal is to have our system do the following:
- Boot to a shell and be able to communicate with it over serial
- Use the RNX-N150HG USB Wifi dongle to talk to the outside world
- Networking support
- Read only file system with compression (SquashFS)
- Run various tools (htop, stress, tmux, ping)
- If possible, an ssh server and TCC to compile a small C based demo program
Things we do not need:
- Video output (no Xorg or VGA or DVI)
- Audio output (no I2S)
- Any file system other than SquashFS
Non Relevant Wireless USB drivers
We only need to support the Atheros AR9002U chipset and Atheros AR9271 wireless chip. The driver for this is called ath9k_htc according to the wikidev site. It is not clear what this driver is listed under in the kernel configuration, so simply searching for symbols with "ath9k" when using the nconfig viewer gives this.
This tells us we need to go into Device Drivers -> Network device Support -> Wireless LAN and enable the Atheros/Qualcomm devices node to enable the ATH9k driver. This is a general driver for the Atheros and Qualcomm interface though, not the specific chipset we are using. If a symbol search doesn't turn up the information you need, searching the kernel mirror on GitHub usually will. In our case, searching for the AR9271 keyword in the repository gives us this, showing that we also need to enable ATH9k_HTC. If we enable these two components and disable all other entries in Wireless LAN then the size of the zImage is still 2.7MB but the compressed rootfs is only 848KB, which is a savings of 380KB compared to the previous 1.2MB!
Non Relevant kernel modules
While our root file system dropped to satisfactory levels, our kernel image is still far too large. Here is a chart showing all the kernel functionality which can be removed while still being able to satisfy our intended functionality.
Component Name | zImage size in kB |
---|---|
Graphics support (DRM, backlight, logo, framebuffer) | 169 kB |
ext4 in File Systems | 138 kB |
Network File Systems in File systems | 112 kB |
Atheros and HTC driver as Module (+116 kB to RootFS) | 94 kB |
Atheros and HTC driver built into Kernel | 91 kB |
soundcard support | 84 kB |
Miscellaneous file systems (UBIFS) in File systems | 77 kB |
Multimedia support | 62 kB |
SCSI device support | 56 kB |
MMC SD SDIO card support | 46 kB |
Enable Stack unwinding support | 39 kB |
UBI Support | 32 kB |
NAND device support in Memory Technology Device (MTD) support | 28.5 kB |
USB Gadget | 28 kB |
HID | 27.8 kB |
vfat in DOS/FAT/NT file systems | 20 kB |
Industrial IO support | 19 kB |
Suspend to RAM and standby | 16.6 kB |
Ethernet Driver in Network device support | 15 kB |
Initial RAM disk/file system | 14.5 kB |
EHCI HCD | 14.4 kB |
Voltage and current regulator support | 14 kB |
PHY Device support in Network Device Support | 13 kB |
I2C Support | 12.8 kB |
Real Time Clock | 11.5 kB |
all input device support | 11 kB |
USB Serial Converter | 11 kB |
SquashFS with only XZ (removed other compression) | 6.7 kB |
PPS + PTP | 5 kB |
PWM Support | 5 kB |
USB Modem (CDC ACM) | 5 kB |
Watchdog timer support | 3.7 kB |
Power supply class support | 3.5 kB |
NVMEM | 2.3 kB |
Board level reset or power off | 2 kB |
Atmel HLCDC (High-end LCD Controller) | 1.7 kB |
MDIO Bus Device Drivers | 1.5 kB |
Atmel SOC AT91RM9200 | 1 kB |
Verbose user fault messages | 0.5 kB |
After removing all of these and building the WiFi drivers into the kernel instead of as modules, we end up with the following file sizes:
[hak8or@CT108 buildroot-2018.02.1]$ ls -la --block-size=k output/images/
total 2673K
drwxr-xr-x 2 hak8or hak8or 1K May 2 05:00 .
drwxr-xr-x 6 hak8or hak8or 1K May 2 03:25 ..
-rw-r--r-- 1 hak8or hak8or 18K May 2 05:00 at91sam9n12ek_custom.dtb
-rw-r--r-- 1 hak8or hak8or 844K May 2 05:00 rootfs.squashfs
-rw-r--r-- 1 hak8or hak8or 1748K May 2 05:00 zImage
What else
When the kernel gets compiled many object files (with a .o extension) get generated. These get linked into the kernel image during the build. We can use these files to get a rough estimate of what's taking up space. The zImage is well below our limit, so we do not need to change anything here, this is just for curiosity's sake. As the output shows, a large portion of this is networking related (ipv4 and ipv6 stack), and some are various drivers used for wifi.
[hak8or@hak8or build]$ size */built-in.o | sort -n -r -k 4 | head -n 30
102937 93 1040 104070 19686 linux-4.15.7/net/wireless/nl80211.o (ex linux-4.15.7/built-in.o)
47725 1529 492 49746 c252 linux-4.15.7/net/core/dev.o (ex linux-4.15.7/built-in.o)
15770 216 28056 44042 ac0a linux-4.15.7/kernel/printk/printk.o (ex linux-4.15.7/built-in.o)
41269 536 1176 42981 a7e5 linux-4.15.7/net/ipv6/addrconf.o (ex linux-4.15.7/built-in.o)
36370 13 8 36391 8e27 linux-4.15.7/net/ipv4/tcp_input.o (ex linux-4.15.7/built-in.o)
35840 314 0 36154 8d3a linux-4.15.7/net/core/skbuff.o (ex linux-4.15.7/built-in.o)
33790 30 0 33820 841c linux-4.15.7/net/mac80211/mlme.o (ex linux-4.15.7/built-in.o)
28294 287 2128 30709 77f5 linux-4.15.7/drivers/tty/vt/vt.o (ex linux-4.15.7/built-in.o)
28541 675 1 29217 7221 linux-4.15.7/net/ipv6/route.o (ex linux-4.15.7/built-in.o)
28037 68 1044 29149 71dd linux-4.15.7/net/core/rtnetlink.o (ex linux-4.15.7/built-in.o)
28582 156 12 28750 704e linux-4.15.7/drivers/usb/core/hub.o (ex linux-4.15.7/built-in.o)
27864 9 0 27873 6ce1 linux-4.15.7/net/mac80211/tx.o (ex linux-4.15.7/built-in.o)
26977 360 0 27337 6ac9 linux-4.15.7/crypto/aes_generic.o (ex linux-4.15.7/built-in.o)
26657 9 0 26666 682a linux-4.15.7/fs/namei.o (ex linux-4.15.7/built-in.o)
25811 533 4 26348 66ec linux-4.15.7/net/core/filter.o (ex linux-4.15.7/built-in.o)
25857 287 2 26146 6622 linux-4.15.7/net/packet/af_packet.o (ex linux-4.15.7/built-in.o)
26055 0 0 26055 65c7 linux-4.15.7/lib/crc32.o (ex linux-4.15.7/built-in.o)
Next up is fixing an issue with USB seemingly not working.
USB
Our board has two USB ports, one hardwired to be a USB slave (connects to a USB host like your desktop), and the other being a USB OTG port (can run as both a slave and host). The underlying host peripheral in the SOC is designed to work with the OHCI standard. Because it follows that standard, the registers used to access it are the same across all OHCI implementations, so a common driver can be used, which is why the OHCI driver is enabled in the kernel.
So, we should be able to just plug in our USB wifi dongle and it should work, right?
# echo "Hello world"
Hello world
# usb 1-1: new full-speed USB device number 2 using at91_ohci
usb 1-1: device descriptor read/64, error -62
usb 1-1: device descriptor read/64, error -62
usb 1-1: new full-speed USB device number 3 using at91_ohci
usb 1-1: device descriptor read/64, error -62
usb 1-1: device descriptor read/64, error -62
usb usb1-port1: attempt power cycle
usb 1-1: new full-speed USB device number 4 using at91_ohci
usb 1-1: device not accepting address 4, error -62
usb 1-1: new full-speed USB device number 5 using at91_ohci
usb 1-1: device not accepting address 5, error -62
usb usb1-port1: unable to enumerate USB device
echo "aw :("
aw :(
We see a few things here.
- The correct driver (at91_ohci) is being used for the peripheral.
- The correct USB port is being used.
- Correctly recognizing USB full-speed capability.
- Driver is giving USB device an address during enumeration but device fails.
- Device descriptor read (part of enumeration process) is failing.
The last point shouldn't ever really happen; it means the host is asking the device for simple data but is not getting anything back. Usually this means there is a communication issue, like a bad cable, or the USB slave is genuinely not replying (maybe it failed). So what the heck is error -62? Searching around the kernel source we will hit errno.h which gives an error code for various potential errors. In our case the error is "Timer expired" as seen here. Then we have this helpful guide which confirms our suspicion: the explanation for the code is "No response packet received within the prescribed bus turn-around time.".
Debugging
So we know the correct driver is being loaded on the correct USB port, and the USB device is being recognized as full speed capable. The full speed part is important, as it means that the D+ connection is being sensed correctly (USB full speed devices have a 1.5k ohm pullup resistor on D+ as per the standard), therefore it is unlikely to be a pin mux'ing issue. The USB device does work in a laptop or desktop, so it's definitely not dead. Plugging in other USB devices gives the same issue. What else can we do?
Logic Sniffer
Enter a Logic Sniffer. This is an indispensable tool which should be sitting right next to your oscilloscope. Most places that are willing to invest in their equipment have at least one very nice oscilloscope with the ability to decode signals like SPI/I2C/Serial and more. The fancier scopes that sample fast enough can also decode the fancier protocols like USB and even PCI-E, but they are fairly pricey (easily $10,000 and more). Logic sniffers tend to be cheaper because all they do is measure HIGH/LOW states, not the signal in 256 (or more) voltage levels.
Thankfully there are logic sniffers out there like the DSLogic which can measure signals up to 400 MHz for only $100 and work with the amazing open source tool Sigrok. The best part about Sigrok is that it's open source and actually works, including the moderately intuitive PulseView that lets you graphically view the signal and has many decoders built in, including USB 2.0! We want to view the D+/D- pair from the USB bus, so I grabbed a random USB device I had lying around and soldered in some wires. Yes this totally ruins signal integrity but all we want to do is get a rough idea of what's going on.
Thankfully there is an example capture for USB communications, including the pairing process, so we have something to compare to.
Yeah, that looks somewhat wonky, why are the potentially decoded bits of our signal not on the edge transitions and instead over more than 1 transition? Why is the decoded packet extending way beyond our data transitions and, most importantly, why is the raw packet (not decoded, just the bits) taking twice as long as our known to be good capture? Let's look at how long the bit duration is.
The heck? Our bit duration is roughly 167 nanoseconds while the known to be good capture says 80 nanoseconds (USB full speed runs at 12 Mbit/s, so about 83 nanoseconds per bit is expected). This is clearly a timing issue on our board. Seeing as it's a nice clean half speed, maybe a clock divider is being set incorrectly somewhere. For USB 2.0 full speed, most peripherals need a 48 MHz clock input, so what might be happening is that the USB peripheral is being fed a 24 MHz clock instead of 48 MHz, which would explain the half speed.
USB Clocking
Clocking in general these days, for even small micro controllers, is not trivial, with a clock tree consisting of many different nodes, each with its own frequency limitations. Turning a peripheral on or off usually consists of "gating" its clock, meaning enabling or disabling the clock feeding that peripheral. Furthermore, you have many multiplexers to control what part of the clock tree gets its clock from what source, and then throw in dividers to make it even more fun. To put it simply, clock trees are complicated.
For this SOC, the relevant parts of the clock tree are as follows:
16 MHz oscillator -- (MAINCK) --> PLLA
                              \-----> PLLB (Dedicated for USB)
                                        \-- (PLLBCK) --> USB Clock controller
                                                           \-- (UHPCLK) --> USB Host
Looking at this, we expect PLLB to generate 48 MHz or higher from the 16 MHz input, which means a PLL multiplier (actually a divider, based on how a PLL works) of at least 3x. There are a few spots where we can look into what the code assumes the various clocks are set to, one of which is in drivers/usb/host/ohci-at91.c here.
// drivers/usb/host/ohci-at91.c
static void at91_start_clock(struct ohci_at91_priv *ohci_at91)
{
if (ohci_at91->clocked)
return;
clk_set_rate(ohci_at91->fclk, 48000000);
clk_prepare_enable(ohci_at91->hclk);
clk_prepare_enable(ohci_at91->iclk);
clk_prepare_enable(ohci_at91->fclk);
ohci_at91->clocked = true;
}
What are hclk, iclk, and fclk you ask? It doesn't say there, so let's try to find the struct definition of ohci_at91!
struct ohci_at91_priv {
struct clk *iclk;
struct clk *fclk;
struct clk *hclk;
bool clocked;
bool wakeup; /* Saved wake-up state for resume */
struct regmap *sfr_regmap;
};
Great, minimal comments. But wait, turns out that if those clocks fail to be found then the driver reports an error!
/**
* usb_hcd_at91_probe - initialize AT91-based HCDs
* Context: !in_interrupt()
*
* Allocates basic resources for this USB host controller, and
* then invokes the start() method for the HCD associated with it
* through the hotplug entry's driver_data.
*/
static int usb_hcd_at91_probe(const struct hc_driver *driver,
struct platform_device *pdev)
{
// .....
ohci_at91->iclk = devm_clk_get(dev, "ohci_clk");
if (IS_ERR(ohci_at91->iclk)) {
dev_err(dev, "failed to get ohci_clk\n");
retval = PTR_ERR(ohci_at91->iclk);
goto err;
}
ohci_at91->fclk = devm_clk_get(dev, "uhpck");
if (IS_ERR(ohci_at91->fclk)) {
dev_err(dev, "failed to get uhpck\n");
retval = PTR_ERR(ohci_at91->fclk);
goto err;
}
ohci_at91->hclk = devm_clk_get(dev, "hclk");
if (IS_ERR(ohci_at91->hclk)) {
dev_err(dev, "failed to get hclk\n");
retval = PTR_ERR(ohci_at91->hclk);
goto err;
}
// .......
}
So all we really learn from this is that fclk seems to be uhpck, which sits after both the PLL and the USB peripheral clock divider.
Common Clock Framework
Linux has a framework to work with clock trees, including handling dependencies and propagating rate changes to relevant nodes. Each clock has an operations struct containing function pointers to the operations it supports. In our case, we have operations defined for both PLLB (the PLL dedicated to USB) and the USB peripheral clock divider. LWN has a fantastic article on this framework.
// linux/drivers/clk/at91/clk-usb.c
static const struct clk_ops at91sam9n12_usb_ops = {
.enable = at91sam9n12_clk_usb_enable,
.disable = at91sam9n12_clk_usb_disable,
.is_enabled = at91sam9n12_clk_usb_is_enabled,
.recalc_rate = at91sam9x5_clk_usb_recalc_rate,
.determine_rate = at91sam9x5_clk_usb_determine_rate,
.set_rate = at91sam9x5_clk_usb_set_rate,
};
// linux/drivers/clk/at91/clk-pll.c
static const struct clk_ops pll_ops = {
.prepare = clk_pll_prepare,
.unprepare = clk_pll_unprepare,
.is_prepared = clk_pll_is_prepared,
.recalc_rate = clk_pll_recalc_rate,
.round_rate = clk_pll_round_rate,
.set_rate = clk_pll_set_rate,
};
Looking around, it looks like the PLL hardware itself is only modified in clk_pll_prepare(), not in clk_pll_set_rate(). This handles the case where the divisor and multiplier are changed while the PLL is not yet enabled (powered off), since register access while powered off would result in bus faults. The USB clock divider, on the other hand, writes to its hardware immediately in its set_rate().
static int clk_pll_set_rate(struct clk_hw *hw, unsigned long rate,
unsigned long parent_rate)
{
struct clk_pll *pll = to_clk_pll(hw);
long ret;
u32 div;
u32 mul;
u32 index;
ret = clk_pll_get_best_div_mul(pll, rate, parent_rate,
&div, &mul, &index);
if (ret < 0)
return ret;
pll->range = index;
pll->div = div;
pll->mul = mul;
return 0;
}
// ....
static int clk_pll_prepare(struct clk_hw *hw)
{
struct clk_pll *pll = to_clk_pll(hw);
// ...
regmap_update_bits(regmap, offset, layout->pllr_mask,
pll->div | (PLL_MAX_COUNT << PLL_COUNT_SHIFT) |
(out << PLL_OUT_SHIFT) |
((pll->mul & layout->mul_mask) << layout->mul_shift));
while (!clk_pll_ready(regmap, pll->id))
cpu_relax();
return 0;
}
The Issue
When a rate change is requested, the request gets propagated to the relevant nodes in the clock tree. In our case, we are modifying fclk, which is actually the USB peripheral clock after the PLL and USB clock divider. To get the clock rate of a node, the operation recalc_rate() is called. For the PLL, we see that the cached values (from the struct) are not being used, and instead the hardware itself is queried.
static unsigned long clk_pll_recalc_rate(struct clk_hw *hw,
unsigned long parent_rate)
{
struct clk_pll *pll = to_clk_pll(hw);
unsigned int pllr;
u16 mul;
u8 div;
regmap_read(pll->regmap, PLL_REG(pll->id), &pllr);
div = PLL_DIV(pllr);
mul = PLL_MUL(pllr, pll->layout);
if (!div || !mul)
return 0;
return (parent_rate / div) * (mul + 1);
}
While recalc_rate() is documented in the kernel as querying the hardware, in our case the hardware is out of sync with the cached values if it is called between set_rate() and prepare(). Therefore, when the USB divider gets configured during the propagation of the requested rate, it will call recalc_rate() on its parent and get a stale clock rate back.
In our case, PLLB is being configured to run at 96 MHz on boot before this call. Then the OHCI driver requests a 48 MHz clock for the USB peripheral, which has the PLL set its cached MUL and DIV values (via set_rate()) appropriately. Then the USB clock divider queries its parent node (the PLL) for its clock rate, which returns the rate currently configured in hardware (96 MHz in this case). The USB clock divider therefore sets its divider to /2 to get 48 MHz.
Then the OHCI driver calls clk_prepare_enable() on the clock, resulting in the PLL applying the cached MUL and DIV values to the PLL hardware, changing the PLL frequency from 96 MHz to 48 MHz. But the USB clock divider is still /2, giving a 24 MHz frequency, hence the USB bus running at half speed.
The Fix
The fix involves having recalc_rate()
for the PLL use the cached values instead of querying the hardware. While this goes directly against the kernel documentation, it seems to sometimes happen. For example, the Renesas clock driver does this and seems to have gone through.
So, all that needs to be done is to use the MUL and DIV values from the PLL struct.
static unsigned long clk_pll_recalc_rate(struct clk_hw *hw,
unsigned long parent_rate)
{
struct clk_pll *pll = to_clk_pll(hw);
- unsigned int pllr;
- u16 mul;
- u8 div;
-
- regmap_read(pll->regmap, PLL_REG(pll->id), &pllr);
-
- div = PLL_DIV(pllr);
- mul = PLL_MUL(pllr, pll->layout);
-
- if (!div || !mul)
- return 0;
- return (parent_rate / div) * (mul + 1);
+ return (parent_rate / pll->div) * (pll->mul + 1);
}
Further exploring shows that this was fixed years ago and was accepted, but later undone.
Now we plug it in and what do we get?
usb 1-1: new full-speed USB device number 3 using at91_ohci
usb 1-1: New USB device found, idVendor=0cf3, idProduct=9271
usb 1-1: New USB device strings: Mfr=16, Product=32, SerialNumber=48
usb 1-1: Product: USB2.0 WLAN
usb 1-1: Manufacturer: ATHEROS
usb 1-1: SerialNumber: 12345
Yay! Now that we have USB working, we can now submit this to the kernel, after which we will add networking and play with some packages to make this system more fun.
Contributing to the Linux Kernel
The process involved to contribute to the Linux kernel is not trivial, with you having to go through their mailing lists and being unable to use GMail. This should give a rough overview of how to go through the process so I don't forget myself, using my USB patch as an example. This is based on a few guides, some of which are this, this, and this, and some very helpful people on the IRC (especially gregkh).
Getting the Kernel
First things first, grab an up to date kernel. The kernel has a website showing the state of its git based development. On there you can find what URL to use for doing a git clone.
But before you do that, keep in mind the kernel is huge, with many thousands of commits. Doing a simple git clone takes a very long time on my machine, so you can instead do a "shallow" clone, which only fetches the most recent commit instead of the full history. This drastically speeds things up.
git clone --depth 1 git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
cd linux
Creating the patch
We have to modify the recalc_rate()
function to use the cached DIV and MUL values. So in drivers/clk/at91/clk-pll.c
change the clk_pll_recalc_rate()
function as shown below. The "-" signs mean to remove the line, and the "+" signs mean to add the line. In our case we are removing most of the function body and replacing how the return value is calculated. Conceptually, this is all the information the kernel people need, but we still have to add some things.
static unsigned long clk_pll_recalc_rate(struct clk_hw *hw,
unsigned long parent_rate)
{
struct clk_pll *pll = to_clk_pll(hw);
- unsigned int pllr;
- u16 mul;
- u8 div;
-
- regmap_read(pll->regmap, PLL_REG(pll->id), &pllr);
-
- div = PLL_DIV(pllr);
- mul = PLL_MUL(pllr, pll->layout);
-
- if (!div || !mul)
- return 0;
- return (parent_rate / div) * (mul + 1);
+ return (parent_rate / pll->div) * (pll->mul + 1);
}
Create a git commit with this change with a git add drivers/clk/at91/clk-pll.c and then git commit --signoff. The --signoff flag adds a Signed-off-by line with your name and email to the end of the commit message, certifying that you have the right to submit the change.
The title must specify what subsystem(s) this is for (in our case CLK and AT91), followed by a short summary. In our case, it will be "clk: at91: PLL recalc_rate() now using cached MUL and DIV values". Then write a proper git commit message detailing what the issue is and how it was fixed. Be specific and clear, this goes to developers who have tons of things to do and little time, so try to make this as painless as possible for them.
Then do a git format-patch HEAD~ to create a patch file (0001-clk-at91-PLL-recalc_rate-now-using-cached-MUL-and-DI.patch in our case), the contents of which are as follows:
[hak8or@hak8or linux_commit]$ cat 0001-clk-at91-PLL-recalc_rate-now-using-cached-MUL-and-DI.patch
From 47ded631f3c787f00272d140dbb5ff1842e2716d Mon Sep 17 00:00:00 2001
From: Marcin Ziemianowicz <marcin@ziemianowicz.com>
Date: Sun, 29 Apr 2018 14:04:37 -0400
Subject: [PATCH V4] clk: at91: PLL recalc_rate() now using cached MUL and DIV values
When a USB device is connected to the USB host port on the SAM9N12 then
you get "-62" error which seems to indicate USB replies from the device
are timing out. Based on a logic sniffer, I saw the USB bus was running
at half speed.
The PLL code uses cached MUL and DIV values which get set in set_rate()
and applied in prepare(), but the recalc_rate() function instead
queries the hardware instead of using these cached values. Therefore,
if recalc_rate() is called between a set_rate() and prepare(), the
wrong frequency is calculated and later the USB clock divider for the
SAM9N12 SOC will be configured for an incorrect clock.
In my case, the PLL hardware was set to 96 Mhz before the OHCI
driver loads, and therefore the usb clock divider was being set
to /2 even though the OHCI driver set the PLL to 48 Mhz.
As an alternative explanation, I noticed this was fixed in the past by
87e2ed338f1b ("clk: at91: fix recalc_rate implementation of PLL
driver") but the bug was later re-introduced by 1bdf02326b71 ("clk:
at91: make use of syscon/regmap internally").
Fixes: 1bdf02326b71 ("clk: at91: make use of syscon/regmap internally)
Cc: <stable@vger.kernel.org>
Signed-off-by: Marcin Ziemianowicz <marcin@ziemianowicz.com>
---
Thank you for bearing with me about this Boris.
Changes since V3:
Fix for double returns found by kbluild test robot
> Comments by Boris Brezillon about email formatting issues
Changes since V2:
Removed all logging/debug messages I added
> Comment by Boris Brezillon about my fix being wrong addressed
Changes since V1:
Added patch set cover letter
Shortened lines which were over >80 characters long
> Comment by Greg Kroah-Hartman about "from" field in email addressed
> Comment by Alan Stern about redundant debug lines addressed
drivers/clk/at91/clk-pll.c | 13 +------------
1 file changed, 1 insertion(+), 12 deletions(-)
diff --git a/drivers/clk/at91/clk-pll.c b/drivers/clk/at91/clk-pll.c
index 7d3223fc..72b6091e 100644
--- a/drivers/clk/at91/clk-pll.c
+++ b/drivers/clk/at91/clk-pll.c
@@ -132,19 +132,8 @@ static unsigned long clk_pll_recalc_rate(struct clk_hw *hw,
unsigned long parent_rate)
{
struct clk_pll *pll = to_clk_pll(hw);
- unsigned int pllr;
- u16 mul;
- u8 div;
-
- regmap_read(pll->regmap, PLL_REG(pll->id), &pllr);
-
- div = PLL_DIV(pllr);
- mul = PLL_MUL(pllr, pll->layout);
-
- if (!div || !mul)
- return 0;
- return (parent_rate / div) * (mul + 1);
+ return (parent_rate / pll->div) * (pll->mul + 1);
}
static long clk_pll_get_best_div_mul(struct clk_pll *pll, unsigned long rate,
--
2.17.0
There are a few things to note here.
- The commit title starts with [PATCH V4]. When running git format-patch HEAD~, only [PATCH] gets put into the subject line of the patch file. Since this was my 4th attempt at the patch, the version had to be added to the subject line manually, and every resend needs its own version there.
- Notes go right after the --- line that follows the commit message. These notes are meant for the mailing list and do not get put into the commit message. In my case they show the changes between each patch version and a short note thanking Boris for putting up with me and my issues with getting this right. :P
- The From: field is the same as the Signed-off-by field. Ensure these match!
- Since this was traced back to a bug in a commit done a long time ago, that commit's hash and title are added in a Fixes field. Furthermore, a Cc field was added so when we run git send-email later, the address gets automatically added to the list of people to CC the email to. The Cc entry is also there (from what I understand) to let kernel maintainers know that the fix can be backported to past kernels.
- Past commits are referenced by their commit hash and the title of the commit, not a link to LKML or GitHub or anything else.
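The points above are easy to get wrong by hand, so a quick sanity check before sending can help. This is only a sketch: check_patch_headers is a hypothetical helper (not a kernel script), and the sample file is a cut down copy of the headers shown earlier.

```shell
#!/bin/sh
# check_patch_headers FILE
# Hypothetical helper: checks the header conventions described above.
check_patch_headers() {
    patch=$1
    # "From: " (with a colon) skips the mbox "From <hash> ..." first line.
    from=$(sed -n 's/^From: //p' "$patch" | head -n 1)
    sob=$(sed -n 's/^Signed-off-by: //p' "$patch" | head -n 1)
    subject=$(sed -n 's/^Subject: //p' "$patch" | head -n 1)

    if [ -z "$from" ] || [ "$from" != "$sob" ]; then
        echo "From and Signed-off-by differ"
        return 1
    fi
    case $subject in
        "[PATCH"*) ;;
        *) echo "Subject is missing the [PATCH] prefix"; return 1 ;;
    esac
    # A Fixes: tag is a commit hash followed by the quoted commit title.
    if ! grep -Eq '^Fixes: [0-9a-f]{12,} \("' "$patch"; then
        echo "note: no Fixes: tag found"
    fi
    echo "headers ok"
}

sample=$(mktemp)
cat > "$sample" <<'EOF'
From 47ded631f3c787f00272d140dbb5ff1842e2716d Mon Sep 17 00:00:00 2001
From: Marcin Ziemianowicz <marcin@ziemianowicz.com>
Subject: [PATCH V4] clk: at91: PLL recalc_rate() now using cached MUL and DIV values

Fixes: 1bdf02326b71 ("clk: at91: make use of syscon/regmap internally")
Signed-off-by: Marcin Ziemianowicz <marcin@ziemianowicz.com>
EOF
check_patch_headers "$sample"
rm -f "$sample"
```

This does not replace checkpatch.pl; it only catches the mismatches that tripped me up.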
Verifying
There are multiple tools to ensure the changes are following kernel guidelines. One way is to use the checkpatch.pl script, which in our case reports no errors or warnings. The only complaint is that the codespell dictionary is missing, so the typo check is skipped, which is probably fine.
[hak8or@hak8or linux]$ ./scripts/checkpatch.pl --strict --codespell ../0001-clk-at91-PLL-recalc_rate-now-using-cached-MUL-and-DI.patch
No codespell typos will be found - file '/usr/share/codespell/dictionary.txt': No such file or directory
total: 0 errors, 0 warnings, 0 checks, 20 lines checked
../0001-clk-at91-PLL-recalc_rate-now-using-cached-MUL-and-DI.patch has no obvious style problems and is ready for submission.
You can also run the script on the source file itself, which shows issues that are unrelated to our change. If you want to fix these, split them into their own separate commits instead of mixing them into the bug fix.
[hak8or@hak8or linux]$ ./scripts/checkpatch.pl -f drivers/clk/at91/clk-pll.c
WARNING: Missing or malformed SPDX-License-Identifier tag in line 1
#1: FILE: drivers/clk/at91/clk-pll.c:1:
+/*
WARNING: line over 80 characters
#103: FILE: drivers/clk/at91/clk-pll.c:103:
+ characteristics->icpll[pll->range] << PLL_ICPR_SHIFT(id));
ERROR: open brace '{' following function definitions go on the next line
#139: FILE: drivers/clk/at91/clk-pll.c:139:
+static long clk_pll_get_best_div_mul(struct clk_pll *pll, unsigned long rate,
+ unsigned long parent_rate,
+ u32 *div, u32 *mul,
+ u32 *index) {
total: 1 errors, 2 warnings, 519 lines checked
NOTE: For some of the reported defects, checkpatch may be able to
mechanically convert to the typical style using --fix or --fix-inplace.
drivers/clk/at91/clk-pll.c has style problems, please review.
NOTE: If any of the errors are false positives, please report
them to the maintainer, see CHECKPATCH in MAINTAINERS.
Who to send to
The Linux kernel development ecosystem relies heavily on mailing lists. No, this isn't like Github where you get fancy shmancy commenting, emojis, or even pull requests. This is the old fashioned way which seems to work best for them: over email.
The kernel has many developers, each of whom follows the mailing lists for their subsystems. The mailing lists are split up into various categories you can find here, such as linux-pm for Linux Power Management or linux-clk for Linux Clocking.
Thankfully there is a script to help you find out who to email your changes to. In our case, since this is a single file change, simply running the get_maintainer.pl
script on our changed file suffices.
[hak8or@hak8or linux]$ ./scripts/get_maintainer.pl ../0001-clk-at91-PLL-recalc_rate-now-using-cached-MUL-and-DI.patch
Boris Brezillon <boris.brezillon@bootlin.com> (maintainer:ARM/ATMEL AT91 Clock Support)
Michael Turquette <mturquette@baylibre.com> (maintainer:COMMON CLK FRAMEWORK)
Stephen Boyd <sboyd@kernel.org> (maintainer:COMMON CLK FRAMEWORK)
Nicolas Ferre <nicolas.ferre@microchip.com> (supporter:ARM/Microchip (AT91) SoC support)
Alexandre Belloni <alexandre.belloni@bootlin.com> (supporter:ARM/Microchip (AT91) SoC support)
linux-clk@vger.kernel.org (open list:COMMON CLK FRAMEWORK)
linux-arm-kernel@lists.infradead.org (moderated list:ARM/Microchip (AT91) SoC support)
linux-kernel@vger.kernel.org (open list)
The kernel also makes use of the TO and CC fields in emails, with TO being for the people most directly associated with the change and CC for everyone else. In our case, the divide will be like this:
to:
Boris Brezillon <boris.brezillon@free-electrons.com>, Nicolas Ferre <nicolas.ferre@microchip.com>, Alexandre Belloni <alexandre.belloni@bootlin.com>
CC:
Michael Turquette <mturquette@baylibre.com>, Stephen Boyd <sboyd@kernel.org>, linux-clk@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org
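Typing these addresses by hand is error prone, so here is a sketch that turns get_maintainer.pl output into git send-email flags. The rule it applies is an assumption for illustration (people go to --to, bare mailing list addresses go to --cc), which is not exactly the split I used above, and maintainers_to_flags is a hypothetical helper.

```shell
#!/bin/sh
# Reads get_maintainer.pl output on stdin, prints one send-email flag per line.
maintainers_to_flags() {
    while IFS= read -r line; do
        [ -n "$line" ] || continue
        # Drop the trailing "(role)" annotation, starting at the first " (".
        addr=${line%%" ("*}
        case $addr in
            # People are printed as "Name <address>", lists as a bare address.
            *"<"*) printf '%s\n' "--to='$addr'" ;;
            *)     printf '%s\n' "--cc='$addr'" ;;
        esac
    done
}

maintainers_to_flags <<'EOF'
Boris Brezillon <boris.brezillon@bootlin.com> (maintainer:ARM/ATMEL AT91 Clock Support)
Nicolas Ferre <nicolas.ferre@microchip.com> (supporter:ARM/Microchip (AT91) SoC support)
linux-clk@vger.kernel.org (open list:COMMON CLK FRAMEWORK)
linux-kernel@vger.kernel.org (open list)
EOF
```

Review the result before using it. Who belongs in TO versus CC is a judgment call, and the script only saves typing.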
Sending
Great, so we know who to send it to. What about actually sending it? We can't really use GMail since it seems to break the patch formatting. Instead, git send-email will be used. Sure, you can use Mutt for this, but it was very error prone in my experience. Configuring git send-email is pretty straightforward, with this being my ~/.gitconfig.
[hak8or@hak8or linux]$ cat ~/.gitconfig
[user]
email = marcin@ziemianowicz.com
name = Marcin Ziemianowicz
[core]
editor = code --wait --new-window
[sendemail]
smtpuser = marcin@ziemianowicz.com
smtpPass = Put_your_smtp_pass_here
smtpserver = smtp.zoho.com
smtpencryption = tls
smtpserverport = 587
First send an email just to yourself (just git send-email with the patch file and no recipients), leaving the TO field empty (CC gets populated with your own email). Do this to ensure the format is correct, there are no typos, and as a last check that you didn't miss something in your bugfix.
Lastly, send out the actual patch!
git send-email \
../0001-clk-at91-PLL-recalc_rate-now-using-cached-MUL-and-DI.patch \
--cc='Michael Turquette <mturquette@baylibre.com>' \
--cc='Stephen Boyd <sboyd@kernel.org>' \
--cc='linux-clk@vger.kernel.org' \
--cc='linux-arm-kernel@lists.infradead.org' \
--cc='linux-kernel@vger.kernel.org' \
--to='Boris Brezillon <boris.brezillon@free-electrons.com>' \
--to='Nicolas Ferre <nicolas.ferre@microchip.com>' \
--to='Alexandre Belloni <alexandre.belloni@bootlin.com>' \
--to='Greg Kroah-Hartman <gregkh@linuxfoundation.org>'
Here is this commit in the wild on the LKML website which is great for tracking your messages. You can also use PatchWork which isn't bad either.
Replying to Mailing List
You sent it out to the kernel mailing list, and you will definitely get feedback. We can't really use GMail for this, so might as well use Mutt. Thankfully its settings are fairly similar to the ones used for git send-email. I won't go over how to configure it (which was a pain), but here is what to put in your ~/.muttrc to have it work with Zoho Mail.
[hak8or@hak8or linux]$ cat ~/.muttrc
set envelope_from=yes
set realname = 'Marcin Ziemianowicz'
set from="Marcin Ziemianowicz <marcin@ziemianowicz.com>"
set use_from=yes
set edit_headers=yes
set smtp_url = "smtps://marcin@ziemianowicz.com@smtp.zoho.com"
set smtp_pass = "Put your smtp_pass here!"
set ssl_force_tls = yes
set folder = imaps://imappro.zoho.com:993
set imap_user = marcin@ziemianowicz.com
set imap_pass = Put_your_smtp_pass_here
set spoolfile = +INBOX
mailboxes = +INBOX
# Store message headers locally to speed things up.
# If hcache is a folder, Mutt will create sub cache folders for each account which may speeds things up even more.
set header_cache = ~/.cache/mutt
# Store messages locally to speed things up, like searching message bodies.
# Can be the same folder as header_cache.
# This will cost important disk usage according to your e-mail amount.
set message_cachedir = "~/.cache/mutt"
# Specify where to save and/or look for postponed messages.
set postponed = +[stuff]/Drafts
# Allow Mutt to open a new IMAP connection automatically.
unset imap_passive
# Keep the IMAP connection alive by polling intermittently (time in seconds).
set imap_keepalive = 300
# How often to check for new mail (time in seconds).
set mail_check = 120
With an email open, use the g key (group reply) so your reply goes to everyone on the thread, and write your answers inline, directly below the quoted text you are responding to. For example,
> Some long message from a great
> kernel maintainer goes here.
>
> > Looks like they quoted yet someone else!
> > These are basically nested quotes.
>
> But here they ask a question?
So you reply here! See the lack of > characters?
The > character is considered to be a quote from someone else.
> And some other text from the maintainer
> goes here.
This is surprisingly readable actually, much better than expected.
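Mutt adds the quote markers for you, but the convention itself is simple enough to sketch in shell: every line of the message being replied to gets a "> " prefix, so text that was already quoted ends up nested. quote_reply is a hypothetical helper for illustration only.

```shell
#!/bin/sh
# Prefix every line on stdin with "> ", the way a mail client quotes a reply.
quote_reply() {
    sed 's/^/> /'
}

printf '%s\n' \
    '> Looks like they quoted yet someone else!' \
    'But here they ask a question?' | quote_reply
```

The first line comes out with two levels of quoting and the second with one, which is how you can tell who said what several replies deep.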
Summary
Yes, this is a jarring experience if you are used to the pull request systems implemented in Github or GitLab, but it actually works. Linux has been using this method for many many years and it's still going, so clearly it's doing something right. Not to mention, it seems to scale well too based on how many commits there are per day. But it still has a decent learning curve and tons of little nit-picks which don't seem to be fully documented in one place.
Lastly, here is my patch on the LKML website, and here is the patchwork version.
Networking
Now that we have the USB issue solved, we have to ensure that the driver for our dongle is being included as shown earlier. To verify this, run make linux-nconfig and check that Device Drivers->Network Device Support->Wireless LAN->Atheros/Qualcomm devices is enabled, and that support for Atheros HTC based wireless cards is also checked. The firmware itself for our dongle must also be enabled in buildroot via Target Packages->Hardware Handling->Firmware->Linux-firmware->Wifi firmware->Atheros 9271. The process is the same for other WiFi dongles. When you plug the dongle in, you should see the following in dmesg or the terminal:
usb 1-1: new full-speed USB device number 3 using at91_ohci
usb 1-1: New USB device found, idVendor=0cf3, idProduct=9271
usb 1-1: New USB device strings: Mfr=16, Product=32, SerialNumber=48
usb 1-1: Product: USB2.0 WLAN
usb 1-1: Manufacturer: ATHEROS
usb 1-1: SerialNumber: 12345
....
------------[ cut here ]------------
WARNING: CPU: 0 PID: 3 at drivers/usb/core/urb.c:471 usb_submit_urb+0x24c/0x488
usb 1-1: BOGUS urb xfer, pipe 1 != type 3
Modules linked in:
CPU: 0 PID: 3 Comm: kworker/0:0 Tainted: G W 4.15.18 #4
Hardware name: Atmel AT91SAM9
Workqueue: events request_firmware_work_func
[<c010734c>] (unwind_backtrace) from [<c0105344>] (show_stack+0x10/0x14)
[<c0105344>] (show_stack) from [<c010e364>] (__warn+0xd4/0xec)
[<c010e364>] (__warn) from [<c010e3b0>] (warn_slowpath_fmt+0x34/0x44)
[<c010e3b0>] (warn_slowpath_fmt) from [<c02e8874>] (usb_submit_urb+0x24c/0x488)
[<c02e8874>] (usb_submit_urb) from [<c02d7e44>] (hif_usb_send+0x268/0x2b8)
[<c02d7e44>] (hif_usb_send) from [<c02d854c>] (ath9k_wmi_cmd+0x124/0x178)
[<c02d854c>] (ath9k_wmi_cmd) from [<c02dd364>] (ath9k_regwrite+0xd8/0xdc)
[<c02dd364>] (ath9k_regwrite) from [<c02b7994>] (ath9k_hw_init_pll+0x2b8/0x56c)
[<c02b7994>] (ath9k_hw_init_pll) from [<c02b9398>] (ath9k_hw_disable+0x40/0x48)
[<c02b9398>] (ath9k_hw_disable) from [<c02ddce4>] (ath9k_htc_probe_device+0x6fc/0x870)
[<c02ddce4>] (ath9k_htc_probe_device) from [<c02d6738>] (ath9k_htc_hw_init+0x10/0x30)
[<c02d6738>] (ath9k_htc_hw_init) from [<c02d78d4>] (ath9k_hif_usb_firmware_cb+0x54c/0x5f4)
[<c02d78d4>] (ath9k_hif_usb_firmware_cb) from [<c028d004>] (request_firmware_work_func+0x38/0x60)
[<c028d004>] (request_firmware_work_func) from [<c011f6a4>] (process_one_work+0x1b8/0x2fc)
[<c011f6a4>] (process_one_work) from [<c0120218>] (worker_thread+0x2b0/0x428)
[<c0120218>] (worker_thread) from [<c0123f14>] (kthread+0xfc/0x114)
[<c0123f14>] (kthread) from [<c01024e0>] (ret_from_fork+0x14/0x34)
---[ end trace 58ebef53bfa50e07 ]---
....
Ugh, more issues?
What the heck is this usb 1-1: BOGUS urb xfer, pipe 1 != type 3
doing spamming our console? Looking around, there has been some work on this, and someone else getting this issue. Sadly, there is no trivial proper fix for this.
This requires a little background first though! When writing this guide, I had only two dongles on hand. The first is an OurLink AC600, which is based on an 8812au and only has an OOT (Out Of Tree, meaning not mainlined into Linux) driver. There are no plans to upstream the 8812au code because the driver quality is apparently very poor and there isn't enough interest to clean it up, especially considering it's 5 years old at this point. Instead, I am using the RNX-N150HG, which is based on the Atheros 9271. Unfortunately, its driver doesn't currently support USB Full-Speed, which is all the SAM9N12 can muster. But the SAM9N12 says it's USB 2.0, so it should do High-Speed! Well, that's marketing for ya: calling something USB 2.0, which suggests USB High-Speed (480 Mbps), when it actually can only handle USB Full-Speed (12 Mbps).
Well, the driver says running at Full-Speed is not supported, not that it doesn't work. In my experience it works roughly 9 out of 10 boots, so good enough for me. Instead, let's work on getting rid of the usb 1-1: BOGUS urb xfer, pipe 1 != type 3 spam we get from the dongle. We can just disable all console logging via dmesg -n 1, but then we lose other potentially relevant information. Instead, we can change dev_WARN(...) to dev_warn_once(...) in drivers/usb/core/urb.c where this issue happens, which will make the warning show up only once. We could have used dev_warn_ratelimited(...) instead, to rate limit the warning so it shows up at most 10 times every 5 seconds, but I found this to still spam the logs too much.
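To put numbers on "still spams too much", here is a rough simulation of a 10 per 5 seconds limit. It is a simplified model of fixed windows, not the kernel's actual rate limiting code, and the warning rate (one every 10 ms) is an assumption for illustration.

```shell
#!/bin/sh
# simulate_ratelimit BURST INTERVAL_MS STEP_MS TOTAL_MS
# Prints how many evenly spaced warnings would make it into the log.
simulate_ratelimit() {
    awk -v burst="$1" -v interval="$2" -v step="$3" -v total="$4" 'BEGIN {
        begin = -1; printed = 0; shown = 0
        for (t = 0; t < total; t += step) {
            # Open a new window on the first message or once the old one expired.
            if (begin < 0 || t - begin > interval) { begin = t; printed = 0 }
            # Within a window only the first "burst" messages get through.
            if (printed < burst) { printed++; shown++ }
        }
        print shown
    }'
}

# One warning every 10 ms for 20 s, limited to 10 per 5 s.
echo "ratelimited: $(simulate_ratelimit 10 5000 10 20000) lines"
# dev_warn_once() prints a single line no matter how often it fires.
echo "once:        1 line"
```

Under these assumptions the rate limited variant still logs 40 lines in 20 seconds and keeps going for as long as the dongle is active, while the once variant stays at a single line.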
[hak8or@hak8or linux_commit]$ cat 0001-USB-Bogus-Pipe-warning-rate-limited.patch
From 41595efeebbae49555ac1917d0adaee98d1fa4ee Mon Sep 17 00:00:00 2001
From: Marcin Ziemianowicz <marcin@ziemianowicz.com>
Date: Tue, 1 May 2018 22:54:52 -0400
Subject: [PATCH] USB: Bogus Pipe warning rate limited
Fix for just my board to prevent errors from overflow our logs. This
error is not critical but happens extremely often due to a driver issue
which I am fine with. This will never be mainlined since it's a hack.
---
drivers/usb/core/urb.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/usb/core/urb.c b/drivers/usb/core/urb.c
index f51750bc..17ef20fc 100644
--- a/drivers/usb/core/urb.c
+++ b/drivers/usb/core/urb.c
@@ -475,7 +475,7 @@ int usb_submit_urb(struct urb *urb, gfp_t mem_flags)
/* Check that the pipe's type matches the endpoint's type */
if (usb_urb_ep_type_check(urb))
- dev_WARN(&dev->dev, "BOGUS urb xfer, pipe %x != type %x\n",
+ dev_warn_once(&dev->dev, "BOGUS urb xfer, pipe %x != type %x\n",
usb_pipetype(urb->pipe), pipetypes[xfertype]);
/* Check against a simple/standard policy */
--
2.17.0
Now our boot log looks much cleaner. We did lose out on the stack trace, but we know where this is coming from anyways so oh well.
usb 1-1: new full-speed USB device number 2 using at91_ohci
usb 1-1: New USB device found, idVendor=0cf3, idProduct=9271
usb 1-1: New USB device strings: Mfr=16, Product=32, SerialNumber=48
usb 1-1: Product: USB2.0 WLAN
usb 1-1: Manufacturer: ATHEROS
usb 1-1: SerialNumber: 12345
usb 1-1: ath9k_htc: Firmware ath9k_htc/htc_9271-1.4.0.fw requested
usb 1-1: ath9k_htc: Transferred FW: ath9k_htc/htc_9271-1.4.0.fw, size: 51008
usb 1-1: BOGUS urb xfer, pipe 1 != type 3
ath9k_htc 1-1:1.0: ath9k_htc: HTC initialized with 33 credits
ath9k_htc 1-1:1.0: ath9k_htc: FW Version: 1.4
ath9k_htc 1-1:1.0: FW RMW support: On
ieee80211 phy0: Atheros AR9271 Rev:1
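A quick way to confirm the patch took effect is to count how often the warning shows up in the log. The sketch below uses a few lines from the boot log above as sample input; on the board you would pipe dmesg into it instead. count_bogus_urb is a hypothetical helper.

```shell
#!/bin/sh
# Count log lines on stdin that contain the bogus pipe warning.
count_bogus_urb() {
    # grep -c prints the number of matching lines, 0 if there are none.
    # "|| true" hides grep's non-zero exit status in the zero match case.
    grep -c 'BOGUS urb xfer' || true
}

count_bogus_urb <<'EOF'
usb 1-1: ath9k_htc: Transferred FW: ath9k_htc/htc_9271-1.4.0.fw, size: 51008
usb 1-1: BOGUS urb xfer, pipe 1 != type 3
ath9k_htc 1-1:1.0: ath9k_htc: HTC initialized with 33 credits
EOF
```

With the patch applied the count should stay at 1 however long the dongle has been running.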
WPA Supplicant
Next comes being able to connect to a secure network (WPA2 in my case). For that you need a supplicant which can handle WPA, in our case WPA Supplicant. Sadly it is huge at a whopping 408 kB, and after fiddling with its defconfig I wasn't able to find any easy way to greatly reduce its size. It also doesn't support kconfig and therefore menuconfig, so the attempts were mostly trial and error due to the lack of dependency information. Enable it in buildroot via make nconfig, and configure it as per a few great guides, like this from LFS, this from gentoo, or this from, of course, Arch. Lastly, we need iw (only 52 kB) because it is extremely helpful for interfacing with wireless networks, and dhcpcd (only 92 kB) to get an IPv4 address.
[hak8or@CT108 buildroot-2018.02.1]$ ls -la --block-size=k output/images/
total 3305K
drwxr-xr-x 2 hak8or hak8or 1K May 2 05:21 .
drwxr-xr-x 6 hak8or hak8or 1K May 2 03:25 ..
-rw-r--r-- 1 hak8or hak8or 18K May 2 05:21 at91sam9n12ek_custom.dtb
-rw-r--r-- 1 hak8or hak8or 1476K May 2 05:35 rootfs.squashfs
-rw-r--r-- 1 hak8or hak8or 1748K May 2 05:21 zImage
To connect to a WiFi network we need to create the WiFi login key and then tell wpa_supplicant to actually connect.
# Create a file containing the login credentials.
wpa_passphrase OpenWrt ssidpassword > /tmp/w.conf
# Connect to the network using the login credentials.
wpa_supplicant -B -i wlan0 -c /tmp/w.conf
# Get an IPv4 address using dhcpcd (the DHCP client) if it didn't fetch one automatically.
dhcpcd
And now we have a connection!
# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: sit0@NONE: <NOARP> mtu 1480 qdisc noop qlen 1000
link/sit 0.0.0.0 brd 0.0.0.0
3: wlan0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq qlen 1000
link/ether 68:1c:a2:01:17:0b brd ff:ff:ff:ff:ff:ff
inet 192.168.1.227/24 brd 192.168.1.255 scope global wlan0
valid_lft forever preferred_lft forever
# ping hak8or.com
PING hak8or.com (107.191.39.171): 56 data bytes
64 bytes from 107.191.39.171: seq=0 ttl=50 time=21.703 ms
64 bytes from 107.191.39.171: seq=1 ttl=49 time=24.823 ms
64 bytes from 107.191.39.171: seq=2 ttl=49 time=19.870 ms
64 bytes from 107.191.39.171: seq=3 ttl=50 time=24.777 ms
^C
--- hak8or.com ping statistics ---
4 packets transmitted, 4 packets received, 0% packet loss
round-trip min/avg/max = 19.870/22.793/24.823 ms
CRDA
What about the mention of cfg80211: failed to load regulatory.db? Well, after a decent bit of googling, it turns out this is related to the regulations on which regions are allowed to use which channels for WiFi. A recent update to the Linux kernel resulted in needing to use crda (which is a monstrous 320 kB) just to communicate this regulatory information to the kernel.
# Boot log ...
sit: IPv6, IPv4 and MPLS over IPv4 tunneling driver
NET: Registered protocol family 17
Loading compiled-in X.509 certificates
cfg80211: Loading compiled-in X.509 certificates for regulatory database
cfg80211: Loaded X.509 cert 'sforshee: 00b28ddf47aef9cea7'
platform regulatory.0: Direct firmware load for regulatory.db failed with error -2
cfg80211: failed to load regulatory.db
# iw reg get
global
country 00: DFS-UNSET
(2402 - 2472 @ 40), (6, 20), (N/A)
(2457 - 2482 @ 20), (6, 20), (N/A), AUTO-BW, PASSIVE-SCAN
(2474 - 2494 @ 20), (6, 20), (N/A), NO-OFDM, PASSIVE-SCAN
(5170 - 5250 @ 80), (6, 20), (N/A), AUTO-BW, PASSIVE-SCAN
(5250 - 5330 @ 80), (6, 20), (0 ms), DFS, AUTO-BW, PASSIVE-SCAN
(5490 - 5730 @ 160), (6, 20), (0 ms), DFS, PASSIVE-SCAN
(5735 - 5835 @ 80), (6, 20), (N/A), PASSIVE-SCAN
(57240 - 63720 @ 2160), (N/A, 0), (N/A)
By default we seem to be able to access all channels, and the device does currently work, so good enough for me. Therefore, we won't go over installing CRDA and getting it to work, partly because I also wasn't able to get it to work.
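If you want to double check a specific channel against the ranges printed by iw reg get, something like the following sketch works. It only compares a center frequency in MHz against the listed ranges and ignores bandwidth and flags such as PASSIVE-SCAN, so treat it as a rough check. freq_allowed is a hypothetical helper, and the sample input is copied from the output above.

```shell
#!/bin/sh
# freq_allowed MHZ
# Reads "iw reg get" output on stdin, succeeds if MHZ is inside a listed range.
freq_allowed() {
    awk -v f="$1" '
        # Rule lines look like: (2402 - 2472 @ 40), (6, 20), (N/A)
        $1 ~ /^\(/ && $2 == "-" {
            lo = substr($1, 2) + 0
            hi = $3 + 0
            if (f + 0 >= lo && f + 0 <= hi) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}

rules='(2402 - 2472 @ 40), (6, 20), (N/A)
(2457 - 2482 @ 20), (6, 20), (N/A), AUTO-BW, PASSIVE-SCAN
(5170 - 5250 @ 80), (6, 20), (N/A), AUTO-BW, PASSIVE-SCAN'

# 2437 MHz is 2.4 GHz channel 6.
if printf '%s\n' "$rules" | freq_allowed 2437; then
    echo "2437 MHz is inside a listed range"
fi
```

On the board the input would come from iw reg get directly, for example iw reg get | freq_allowed 2437.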
Next up we will install some packages and play around a bit.
Making things better
The end goal is to have the system be fairly usable, with the list from before being:
- Boot to a shell and be able to communicate with it over serial
- Read only file system with compression (SquashFS)
- Networking support
- Use the RNX-N150HG USB Wifi dongle to talk to the outside world
- curl
- htop
- stress
- tmux
- SSH Server
- Nano
As of now, the system can boot into a shell and recognize USB devices, specifically our Wifi dongle. Future changes to the system involve userspace instead of kernel space, hence the divide. We will call this a minimal configuration for our system, which if you want to replicate just follow the replicating documentation.
Musl vs Glibc vs uClibc-ng
There are three main C standard library implementations out there, each with its own pros and cons. Glibc (GNU C Library) is the biggest and most popular, and therefore the "standard". uClibc-ng (a fork of uClibc) is a smaller alternative to Glibc which drops various kinds of backwards compatibility (mostly irrelevant for embedded systems) in favor of space. Musl is a newer implementation of the C standard library under a more permissive and open source friendly license, written from scratch with modern practices in mind. There is a great comparison which goes over specific differences between these three implementations. The library can be changed in buildroot under the Toolchain entry. Since we are targeting a very small system in terms of flash, we have to see which of these is smaller. This is without any packages yet, so it is really just the size of the stdlib itself.
Type | zImage Total | RootFS Total | RootFS Delta |
---|---|---|---|
uClibc-ng | 1658 kB | 964 kB | 0 kB |
Glibc | 1657 kB | 1720 kB | +760 kB |
Musl | 1658 kB | 1036 kB | +76 kB |
If we just worry about size then you would think, hey, let's go with uClibc-ng, right? To that I say, good luck getting locales to work, and therefore tmux. Instead, in our case we will go with Musl. It's very well supported, has tons of functionality similar to Glibc (refer to the previously mentioned comparison chart), and is only 76 kB bigger than uClibc-ng. Most importantly, after spending a few hours trying to get locales to run with uClibc-ng to get tmux to work, I gave up and went with Musl. So Musl it is!
HTTP Webserver
Buildroot supports a few HTTP webservers, such as Nginx, but we need a small one that can fit in our tiny flash space. Here are a few I looked at and their associated size increase of the root file system.
Package | Summary | RootFS Delta |
---|---|---|
lighttpd | Very active even today, lots of features. | 212 kB |
Nginx | Everyone knows nginx, this removed everything except static file hosting. | 146 kB |
uhttpd | Written by OpenWRT people, handles CGI and IPv6. | 56 kB |
thttpd | Last commit was in 2014, simple, and supports CGI + IPv6. | 36 kB |
Boa | Discontinued in 2005. | 20 kB |
darkhttpd | Last commit was in 2016, simple, no CGI, and supports IPv6. | 60 kB |
tinyhttpd | Many versions under this name, shouldn't be used for production. Ridiculously simple and tiny. | 1 kB |
Packages
Buildroot also does a fantastic job of dependency management. It can tell what packages need what libraries or other software and adds them for you. To start off, let's add htop, a great alternative to top which adds colors and makes it visually far more usable. In the buildroot directory, just run make nconfig and go into target packages->System tools to enable the htop package. You can also just search for symbols when using nconfig with the F8 key. Afterwards just run make and done. Htop adds 156 kB to our compressed root file system.
Repeating this process for the other packages, here is what the size of each package is when added to our root file system.
Package | Summary | RootFS Delta |
---|---|---|
WPA_Supplicant | To connect to wireless networks. | 408 kB |
Tmux | Great for when we need two or more terminals at once. | 336 kB |
Htop | Vastly prefer over top, used to tell what the state of the system is. | 156 kB |
LibCurl + Curl | Interfacing with web API's. | 148 kB |
Dropbear + Zlib | Allows us to run an SSH server on our board. Requires Zlib (only 32 kB), 76 kB with or without "Client Programs". | 108 kB |
Nano | Helpful little text editor. | 60 kB |
Dhrystone | Can be fun to use for very rough benchmarking. | 4 kB |
Stress | Stress test the system for IO, CPU, Memory, etc. Sadly can't use stress-ng because we aren't using Glibc due to its size. | 4 kB |
Tmux
Tmux requires a UTF-8 locale instead of our current ASCII locale. What is a locale you ask? It is a bunch of information in a file which tells the system what region you are in and therefore what symbols to use for your currency, how to display your time, how to display numbers (comma vs dot for digit groupings), and other formatting which differs across countries. This also tends to be accompanied with the characters themselves.
If tmux is run without the proper locale set up, then you are greeted with this: tmux: need UTF-8 locale (LC_CTYPE) but have ASCII.
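Before blaming tmux, it helps to see which locale the shell environment is actually asking for. The sketch below follows the standard precedence (LC_ALL, then LC_CTYPE, then LANG) and checks the name for a UTF-8 suffix. It is only an approximation: tmux asks the C library rather than looking at the names, and the locale has to be installed on the system for the setting to work at all. Both function names are hypothetical.

```shell
#!/bin/sh
# Print the locale name that applies to LC_CTYPE in the current environment.
effective_ctype() {
    if [ -n "${LC_ALL:-}" ]; then
        echo "$LC_ALL"
    elif [ -n "${LC_CTYPE:-}" ]; then
        echo "$LC_CTYPE"
    else
        echo "${LANG:-C}"
    fi
}

# Succeed if a locale name such as en_US.UTF-8 selects the UTF-8 charset.
is_utf8_name() {
    case $1 in
        *.UTF-8*|*.utf8*|*.UTF8*|*.utf-8*) return 0 ;;
        *) return 1 ;;
    esac
}

if is_utf8_name "$(effective_ctype)"; then
    echo "locale $(effective_ctype) looks like UTF-8"
else
    echo "locale $(effective_ctype) is not UTF-8, tmux will likely refuse to start"
fi
```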
Dropbear
Since we have network connectivity, we might as well include the ability to connect to the device over SSH, and to connect from it to other devices over SSH. SSH'ing into a system requires either key or password based authentication, but that gives us an issue. Our root file system is read only, and we do not have an overlay to allow writing to it. Therefore, we cannot just create a new user in the running file system or add a password, because both require writing to the file system. Instead, we can have buildroot add a password to the root user under System configuration->Root password. Right now it's set to "pass".
Replicating
To replicate the build this writeup uses, including the specific version of the kernel (4.15.18) and patches for the USB device, here are various files you need. The output size may change slightly due to the GCC toolchain being used by buildroot changing, since it's only specified to use GCC 7.X instead of a specific GCC. Packages such as htop and whatnot are also not locked to a version, so they too may change in size as time goes on.
It is assumed the base directory for this is /home/hak8or
. If this should be modified, then make sure to adjust the absolute paths in the various defconfig
files. There are two versions of the system.
-
A "minimal" system which boots into a shell with Musl as the c standard library, recognizes the USB WiFi dongle, and applies patches to the kernel for having the dongle work. Additionally it has wpa_supplicant and the dongle firmware installed so a network connection can be setup.
-
A "full" version which is on top of the "minimal" version while including packages like htop, curl, tmux, and others. Root user has a password of "pass" which lets people ssh in as the root user.
Both of these versions sit under their own "minimal" or "full" folder. This replication guide will assume the "minimal" version.
AT91 Bootstrap
To build bootstrap, here is an abbreviated version of what was written in the guide.
# Get at91 bootstrap
git clone https://github.com/linux4sam/at91bootstrap.git
# Enter the cloned repo
cd at91bootstrap
# Get the defconfig. Sadly make BrainyV2_AT91Bootstrap_defconfig doesn't work, so we
# have to save it as a .config file. Note the capital -O (output file); a lowercase -o
# would write wget's log to .config instead.
wget https://brainyv2.hak8or.com/AT91SAM9N12/configs/BrainyV2_AT91Bootstrap_defconfig -O .config
# Ensure we have a toolchain installed
yaourt -S arm-none-eabi-gcc
# Say there are only 4 banks in the DRAM IC we are using instead of 8.
sed -i 's/AT91C_DDRC2_NB_BANKS_8/AT91C_DDRC2_NB_BANKS_4/g' board/at91sam9n12ek/at91sam9n12ek.c
# Build bootstrap, flash-able bin is in binaries/boot.bin
make CROSS_COMPILE=arm-none-eabi-
You can also just download the binary itself via wget https://brainyv2.hak8or.com/AT91SAM9N12/configs/BrainyV2_AT91Bootstrap.bin
.
Buildroot + Linux
First we get all the dependencies.
# Get buildroot
wget https://buildroot.org/downloads/buildroot-2018.02.1.tar.gz
tar -xzf buildroot-2018.02.1.tar.gz
# Get the buildroot configuration file for the minimal system. To get the full system, just use BrainyV2_buildroot_full_defconfig instead.
wget https://brainyv2.hak8or.com/AT91SAM9N12/configs/BrainyV2_buildroot_minimal_defconfig
# Get the linux configuration file
wget https://brainyv2.hak8or.com/AT91SAM9N12/configs/BrainyV2_kernel_defconfig
# Get the device tree
wget https://brainyv2.hak8or.com/AT91SAM9N12/configs/BrainyV2.dts
# Get the patch for USB clock being wrong.
wget https://brainyv2.hak8or.com/AT91SAM9N12/configs/clk-at91-PLL-recalc_rate-now-using-cached-MUL-and-DI.patch
# Get the patch for USB bogus pipe error.
wget https://brainyv2.hak8or.com/AT91SAM9N12/configs/USB-Bogus-Pipe-warning-rate-limited.patch
Then to build this, a simple make in the buildroot folder should suffice. The buildroot defconfig should pull in the device tree, apply the USB clock patch, and compile the kernel with the custom defconfig. The output should be in output/images. If you have issues, ensure the paths in the various defconfig files are accurate.
# Enter buildroot dir
cd buildroot-2018.02.1
# Copy over the defconfig
make defconfig BR2_DEFCONFIG=/home/hak8or/BrainyV2_buildroot_minimal_defconfig
# Build our system!
make
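Since the defconfig files refer to the downloaded files by absolute path, a missing or misplaced file only shows up partway through a long build. A small pre-flight check like this sketch can save some time. check_inputs is a hypothetical helper, BASE_DIR is an assumed variable name, and the file names are the ones downloaded above.

```shell
#!/bin/sh
# check_inputs DIR
# Verify the files fetched earlier are present in DIR before starting the build.
check_inputs() {
    dir=$1
    missing=0
    for f in \
        BrainyV2_buildroot_minimal_defconfig \
        BrainyV2_kernel_defconfig \
        BrainyV2.dts \
        clk-at91-PLL-recalc_rate-now-using-cached-MUL-and-DI.patch \
        USB-Bogus-Pipe-warning-rate-limited.patch
    do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $dir/$f"
            missing=1
        fi
    done
    [ "$missing" -eq 0 ] && echo "all inputs present"
    return "$missing"
}

# /home/hak8or is the base directory this writeup assumes.
check_inputs "${BASE_DIR:-/home/hak8or}" || echo "fix the paths above before running make"
```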
Output size
[hak8or@CT108 buildroot-minimal]$ ls -la --block-size=k output/images/
total 3234K
drwxr-xr-x 2 hak8or hak8or 1K May 3 04:58 .
drwxr-xr-x 6 hak8or hak8or 1K May 3 04:39 ..
-rw-r--r-- 1 hak8or hak8or 18K May 3 04:58 BrainyV2.dtb
-rw-r--r-- 1 hak8or hak8or 1408K May 3 05:14 rootfs.squashfs
-rw-r--r-- 1 hak8or hak8or 1749K May 3 04:58 zImage
[hak8or@CT108 buildroot-minimal]$ cd ../buildroot_full
[hak8or@CT108 buildroot_full]$ ls -la --block-size=k output/images/
total 4015K
drwxr-xr-x 2 hak8or hak8or 1K May 3 18:52 .
drwxr-xr-x 6 hak8or hak8or 1K May 3 18:16 ..
-rw-r--r-- 1 hak8or hak8or 18K May 3 18:51 BrainyV2.dtb
-rw-r--r-- 1 hak8or hak8or 2200K May 3 18:52 rootfs.squashfs
-rw-r--r-- 1 hak8or hak8or 1749K May 3 18:51 zImage
You can also find the binaries in the configs/full
or configs/minimal
folders, with each folder including the bootloader, the device tree binary, the kernel zImage, and the root file system.
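As a rough sanity check of the listings above, the three images add up to 18K + 1408K + 1749K = 3175K for the minimal build and 18K + 2200K + 1749K = 3967K for the full build, so both stay under 4096K (4 MB). The ls totals are a bit higher because they count allocated blocks. The sketch below does the same sum for any images directory. The function names and the idea of passing the flash budget in KiB are my own assumptions, not something taken from the repo.

```sh
# Sum the sizes (in bytes) of every regular file in an images directory.
images_total_bytes() {
    total=0
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        size=$(wc -c < "$f")
        total=$((total + size))
    done
    echo "$total"
}

# Succeed only if the images fit within a flash budget given in KiB.
fits_in_flash() {
    [ "$(images_total_bytes "$1")" -le $(( $2 * 1024 )) ]
}
```

For example, fits_in_flash output/images 4096 && echo "fits" would report whether a build still fits in 4 MB.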
New defconfig
Most packages in buildroot, including buildroot itself, have the ability to save their menuconfig state via a make {pkg_name}_savedefconfig
command. Buildroot itself is a bit different; for example, doing make savedefconfig
doesn't seem to create a defconfig file anywhere. Instead, doing make savedefconfig BR2_DEFCONFIG=defconfig
creates a defconfig file at the root of the buildroot directory.
On the other hand, saving the Linux configuration can be done by make linux-savedefconfig
with the file created in output/build/linux-4.15.18/defconfig
.
Final Thoughts
The goal was to make an embedded Linux system from scratch. Originally it was planned to use the NAND Flash IC (S34ML04G200TFI000
at 512 MB) on the board, but sadly it doesn't work, likely due to BGA soldering issues. Since that didn't work, Dataflash was used instead, which made this project more interesting on the software side because of the tight size limit.
For storage, the root file system had to fit in under 2.217 MB
, and with all the packages above included (except HTTP servers) it comes in at 2.148 MB
. So, in under 4 MB of flash we were able to fit the kernel, including drivers for our WiFi dongle, and a root file system with tons of networking tools built in. A big reason we were able to do this is that the file system is both read only and heavily compressed. When booting the system you can tell that the boot stalls at one point (a ~2 second hiccup). This is due to the slow 400 MHz ARM926EJ based core having to load the root file system over a 33 MHz SPI bus from Dataflash and then extract it.
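To get a feel for how much of that hiccup is the SPI transfer itself, here is a back-of-the-envelope sketch. It assumes a single data line moving one bit per clock with no command or addressing overhead, and it takes 2.148 MB to mean 2.148 x 1024 x 1024 bytes; both are my assumptions, and the function name is made up for illustration.

```sh
# Estimate the raw transfer time in seconds for a number of bytes over SPI.
# Assumes one bit per clock cycle and no protocol overhead.
spi_transfer_seconds() {
    awk -v bytes="$1" -v hz="$2" 'BEGIN { printf "%.2f\n", (bytes * 8) / hz }'
}

# Root file system of about 2.148 MiB read at 33 MHz.
spi_transfer_seconds 2252341 33000000
```

Under those assumptions the raw read takes about 0.55 seconds, which would leave most of the ~2 second pause to decompression and other overhead rather than the bus.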
If you noticed, there are mentions of BrainyV2 sprinkled around. As you can guess, this is the second version of my "Brainy" series of projects. The first one was my first ever 4 layer PCB and my first foray into BGA ICs and signals faster than those present in a 120 MHz ARM MCU.
It was a total disaster in terms of assembly (the footprint for the BGA IC was way too small) and expensive to reproduce (~$20 worth of components for a ~25 PCB and many hours of assembly). Modularity was king at this point, so if a design has to be redone, only the part most likely to fail (the main logic board with SoC + DRAM) needs to be discarded, saving money and time.
OSH Park (the fab I use) can at best do 4 layer PCBs with a BGA pitch of 0.8 mm, because its minimum trace width and spacing are both 5 mil. If the BGA pitch is less than 0.8 mm then you cannot route traces between the pads on the PCB, and therefore cannot fan out traces more than one row of balls deep. Thankfully there are still many BGA ICs out there with a ball pitch of 0.8 mm.
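The pitch limit can be worked out with simple geometry: to pass one trace between two neighbouring pads, the gap between them (pitch minus pad diameter) has to hold a space, a trace, and another space. The sketch below computes the largest pad diameter that still leaves room. It ignores solder mask and via placement, and the function name is hypothetical.

```sh
# Largest pad diameter (in mil) that still lets one trace pass between two pads.
# Arguments: ball pitch in mm, minimum trace width in mil, minimum spacing in mil.
max_pad_mil() {
    awk -v pitch_mm="$1" -v trace="$2" -v space="$3" \
        'BEGIN { printf "%.1f\n", pitch_mm / 0.0254 - (trace + 2 * space) }'
}

max_pad_mil 0.8 5 5    # 0.8 mm pitch with 5/5 mil rules
max_pad_mil 0.65 5 5   # 0.65 mm pitch with the same rules
```

With 5 mil trace and space this gives 16.5 mil (about 0.42 mm) of pad at 0.8 mm pitch but only 10.6 mil (about 0.27 mm) at 0.65 mm pitch. Whether 10.6 mil is workable depends on the pad size a given part's datasheet calls for, so treat the conclusion as a rule of thumb.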
All in all, this was a great project. This totally satisfied my need of making a Linux system from scratch and learning lots on the way. My one gripe is how much of a pain it is to assemble these chips: it's easily half a day of work to put one board together, and the likelihood of failure during assembly is high.
Brainy V3 and onwards
Even though assembly of these boards is a pain, and I am not willing to pay a few hundred to outsource assembly on a potentially bad design, I want to make more. Brainy V3 is composed of two designs, each using a more modern SOC, one from Allwinner (V3s) and the other from Freescale (SoloLite). The V3s design has an SDIO based WiFi solution embedded into the PCB too! The PCB design is complete but has to be fixed (pads for BGA were too small again, silk is bad in areas, etc) before assembly and testing is done.
For Brainy V4, I will likely use a PCB fab which can handle BGA ball pitches of 0.65 mm, which gives me access to much more modern SoCs like the RK3399. This beast has two 64 bit A72 cores running at 1.8 GHz and four A53 cores running at 1.4 GHz, DisplayPort, HDMI, 2x 4 lane MIPI DSI, 4 lane PCI-E, and can handle a 64 bit memory interface to LPDDR4 RAM. It's accessible via Taobao and its documentation is online, including various designs which can be used as reference.
Getting started
Right now, you have to manually find and download all the needed dependencies. A script will hopefully be written someday to automate downloading and patching all the files.
Use the Atmel SAM9N12 linux4sam page for a general overview of the build process.
Status
A few years later this board was brought up again with the intention of using the most recent versions of mainline Linux and new tools to help streamline the process. This can be read here. The instructions for setting up the board the old way, via this readme, have been kept because they show how to set everything up by hand and therefore still present useful information.
Dataflash (similar to SPI NOR flash) sits on its own board and is attached to the chip's SPI bus via the SPI bus pads. DRAM is also underclocked to 100 MHz instead of 133 MHz by lowering the main system bus clock, causing the processor to run at 300 MHz instead of 400 MHz. AT91 Bootstrap and U-Boot are located on dataflash at 0x00
and 0x8400
respectively, with U-Boot pulling the kernel from a flash drive connected via USB OTG. The kernel then mounts the rootfs off the flash drive from a dedicated ext2 partition as rw (read write). GCC has been cross compiled for this board and compiles programs correctly, so this board was used for completion of the project.
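Since AT91 Bootstrap starts at 0x00 and U-Boot at 0x8400, the bootstrap binary has 0x8400 = 33792 bytes (33 KiB) before it would run into U-Boot. A small check like the one below can catch an oversized build before flashing. The function name is made up for illustration and the file path in the usage note is only an example.

```sh
# Succeed only if a binary is no larger than the offset of whatever follows it.
# Arguments: path to the binary, offset of the next image (decimal or 0x hex).
fits_before_offset() {
    size=$(( $(wc -c < "$1") ))
    [ "$size" -le $(( $2 )) ]
}
```

For example, fits_before_offset binaries/boot.bin 0x8400 || echo "bootstrap overlaps U-Boot" would warn when the bootstrap has grown past its slot.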
Boot process
- NVM bootloader: Primary bootloader which searches for executable code via ARM exception vectors on NAND, Dataflash, and elsewhere except USB.
- AT91bootstrap: Secondary bootloader which sets up DRAM and puts the next executable code (again, via ARM exception vectors) into DRAM.
- U-Boot: Third bootloader for loading the Linux kernel off USB and into memory while passing proper kernel boot arguments.
- Linux Kernel: Intended application, runs the rootfs off USB (not copied into memory).
Overview
The NVM bootloader exists in ROM on the SAM9N12 and is the first thing executed upon powerup. It searches for possible bootable storage mediums such as NAND, Dataflash, SPI, and others except USB, and also sets up the serial port. If nothing is found then it starts up SAM-BA, which lets the SAM-BA client on a desktop issue commands to the SAM9N12 over USB. This is used for writing both AT91 bootstrap and U-Boot to onboard dataflash.
Next up is the AT91 bootloader, which does pretty much the same as the NVM bootloader but also sets up DRAM. Keep in mind that even if DRAM seems to work via SAM-BA, it does NOT mean that it will also work via the AT91 bootloader; memory timings had to be modified (relaxed) and the clock lowered by 33 MHz to get DRAM working. This sits in Dataflash at 0x00
, which in this case is an 8 megabit chip. Some memory testing routines were added to this. Keep in mind that the endianness of this is NOT the same as most x86 systems, so nibble swapping is done in the memory testing routines when printing.
U-Boot sits in dataflash at 0x8400
and sets up USB host, reads the DOS partition table for the first partition (FAT), loads uImage.bin
, which is the kernel, into memory at 0x00
, and lastly passes control to the kernel along with boot arguments which are set at compile time. MDT-tools can be used for passing storage medium information to the kernel but I didn't use it since it just sucks. Environment variables such as boot media can be supplied via a text file on USB, but it seems we can't initialize USB before reading this file.
The Linux kernel does its magic setting up cache and the MMU and all that jazz, and then mounts the root filesystem from USB partition 2 as an ext2 filesystem (ext4 could be used if enabled in the kernel during compilation). The kernel then looks for /bin/init
, which basically tells busybox to read /etc/inittab
, which tells the system what to do upon a restart/shutdown/ctrl-alt-del/respawn/sysinit. The sysinit entry points to /etc/init.d/rcS
, which handles getting /proc and /sys filled in. A small 5 megabyte ramfs is also made at this step as per rcS
; afterwards the ash shell is started, which is a very small alternative to bash that lacks some features.
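For reference, an rcS along the lines described above could be generated as sketched here. This is not the rcS from the repo, just a minimal guess at its shape. The write_rcs helper, the /tmp mount point, and the use of tmpfs (chosen because it honours a size limit, whereas plain ramfs does not) are all my assumptions.

```sh
# Write a minimal rcS into a staged rootfs directory (hypothetical helper).
write_rcs() {
    rootfs="$1"
    mkdir -p "$rootfs/etc/init.d"
    cat <<'EOF' > "$rootfs/etc/init.d/rcS"
#!/bin/sh
# Populate the kernel's virtual filesystems.
mount -t proc proc /proc
mount -t sysfs sysfs /sys
# Small RAM-backed scratch area, capped at 5 MB (assumed mount point).
mount -t tmpfs -o size=5m tmpfs /tmp
EOF
    chmod +x "$rootfs/etc/init.d/rcS"
}
```

Calling write_rcs rootfs would drop the script into rootfs/etc/init.d/rcS, where the sysinit line of the inittab expects it.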
Root filesystem.
Busybox is a very awesome tool which lets you combine tools like ls, cat, mount, ln, etc into one executable for memory savings and simplicity. Busybox during make install
makes an _install dir that holds the rootfs containing symbolic links to the single busybox executable, allowing calling these tools normally. The single executable can also just be copied into /bin without any symbolic links (assuming static compilation) with the tools called by doing busybox tool
. Busybox can also handle an inittab for you, but it assumes that there are serial ports which aren't there, and so spams the serial port every 250 milliseconds saying it didn't find them. Supplying an inittab without those serial port declarations fixes this.
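The symbolic link trick works because Busybox picks the tool to run from the name it was invoked as. Normally make install creates the links for you, but the idea can be sketched by hand as below. The link_applets helper is hypothetical and only illustrates the layout.

```sh
# Create one symlink per applet name, each pointing at the busybox binary
# that lives in the same directory.
# Arguments: directory holding busybox, then the applet names to link.
link_applets() {
    bindir="$1"
    shift
    mkdir -p "$bindir"
    for applet in "$@"; do
        ln -sf busybox "$bindir/$applet"
    done
}
```

For example, link_applets rootfs/bin ls cat mount ln would make those four commands callable by name, provided the busybox executable has been copied to rootfs/bin.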
A small 5 megabyte ram file system is made via rcS, but it isn't needed and is a remnant of when I used to mount the rootfs as read only. You can write to it using dd and whatnot to do rough memory tests, or to store quickly changing data without wearing down flash.
GCC is included in the rootfs at /usr/home/arm-none-linux-gnueabi/
which has been cross compiled to run on this board, as well as a hello world source file at /usr/home/hello-world/
. Compiling the hello world takes a solid 5 or so seconds, a lot more if any sort of optimizations are enabled.
Libraries are also included in /lib
which are required by busybox if busybox isn't statically compiled. GCC has its own copy of these libraries in /usr/home/arm-none-linux-gnueabi/arm-none-linux-gnueabi/rootfs/
which are used during static compilation by GCC as well as for the dynamic linker. Since gcc was not statically compiled, it also uses the libraries in /lib. The libraries to put in /lib should be copied from the gcc cross-compiler's rootfs/lib/
folder.
Toolchain
Crosstool-NG was indispensable for handling all toolchain issues. While ARM does offer its version of GCC on Launchpad, which includes pending additions to mainline GCC, it is preferable to make a custom cross-compiler in order to select which standard C library to use. Crosstool-NG can properly compile a cross compiler as well as handle the chicken and egg problem with the C library and compiler, but it takes a solid 30 minutes to make a cross-compiler. Then, using a Canadian build (cross-native isn't currently supported), a cross-native compiler is compiled with the target tuple being that of the previously compiled cross-compiler, which takes roughly 45 minutes. In total, compiling a cross compiler and then a cross-native compiler, using a normal build and then a Canadian build respectively, takes roughly an hour and fifteen minutes.
Resources blob from OneTab
- LegacySAM9N12Page < Linux4SAM < TWiki
- Booting The Linux Kernel | STLinux
- UBootEnvVariables < DULG < DENX
- bootloaders:u-boot:usb Analog Devices Open Source | Mixed-signal and Digital Signal Processing ICs
- A Handy U-Boot Trick | Linux Journal
- KernelBuild - Linux Kernel Newbies
- Ttl/sam_board
- sam_board/sam_board.patch at master · Ttl/sam_board
- U-Boot < Linux4SAM < TWiki
- LegacyU-Boot < Linux4SAM < TWiki
- AT91Bootstrap < Linux4SAM < TWiki
- Aria G25 256MB boot problem - Google Groups
- Building U-Boot and Linux 3.11 from scratch for the BeagleBone, and booting
- MT46V32M16P-5B:J - Micron Technology, DRAM Chip. Order from Arrow Electronics.
- Sourcery CodeBench Lite 2014.05-29 for ARM GNU/Linux
- Gentoo Forums :: View topic - VFS: Cannot open root device "sda2" or unknown-block(0,0) ..
- 0x6: Root file system for embedded system - Linux geek's scratchpad
- busybox Inittab
- BusyBox - The Swiss Army Knife of Embedded Linux
- How do I check busybox version (from busybox)? - Unix & Linux Stack Exchange
- android - How to compile Busybox? - Stack Overflow
- BusyBox simplifies embedded Linux systems
- Cross Compiling BusyBox for ARM - BeyondLogic
- Re: [patches] Cross-building instructions
- How to Build a GCC Cross-Compiler
- How To Cross-Compile Clang/LLVM using Clang/LLVM — LLVM 3.7 documentation
- Cross-Compiling for the Raspberry Pi
- Bryan Hundven - Re: Cross compile native gcc for arm with crosstool-ng, have toolchain,
- Linux in Android! DesirAPT is at Beta Test! - Post #5 - XDA Forums
- Crosstool-NG
- crosstool-ng/1 - Introduction.txt at master · crosstool-ng/crosstool-ng
- dayid's screen and tmux cheat sheet
- enable multithreading to use std::thread: operation not permitted arm at DuckDuckGo
- multithreading - C++ Threads, std::system_error - operation not permitted? - Stack Overflow
- c++ - version `CXXABI_1.3.8' not found (required by ...) - Stack Overflow
- embedded linux - When we build a kernel and busy box, we need toolchain only for busybox not for kernel? - Stack Overflow
Getting started
Config file already included, so just applying the patches and then running make ARCH=arm CROSS_COMPILE="put crosscompiler path here"
should suffice.
Using SAM-BA, dump this to dataflash using the burn bootloader option. DO NOT use the save file option: the burn bootloader option fills in the bootloader file length while saving to dataflash, and the NVM bootloader uses that length to determine how much data needs to be copied from dataflash into internal SRAM.
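My understanding of the SAM9 boot ROM description is that this length is stored as a little-endian 32 bit word in the sixth ARM exception vector, at byte offset 0x14 of the image. Treat that offset as an assumption to verify against the datasheet. Under that assumption, a dump read back from dataflash can be inspected with the sketch below; the function name is made up.

```sh
# Print the little-endian 32 bit word stored at byte offset 0x14 (decimal 20).
# Assumption: this is where the boot ROM expects the image length.
vector6_size() {
    set -- $(od -An -tu1 -j20 -N4 "$1")
    echo $(( $1 + ($2 << 8) + ($3 << 16) + ($4 << 24) ))
}
```

If the printed value does not match the size of the bootstrap binary, the image was probably written with the save file option instead of burn bootloader.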
Targets for config
at91sam9n12ekdf_linux_zimage_dt_defconfig
at91sam9n12ekdf_linux_uimage_dt_defconfig
at91sam9n12ekdf_linux_zimage_defconfig
at91sam9n12ekdf_linux_uimage_defconfig
at91sam9n12ekdf_uboot_defconfig
<-- Use this if not using config file.
Dependencies
- Cross-Compiler: Generate using Crosstool-NG (recommended since you will have to make a cross-compiler eventually) or download from ARM's launchpad here.
- AT91bootstrap: The source can be gotten using
git clone git://github.com/linux4sam/at91bootstrap.git
but if you need the old version which is guaranteed to work with this chip, you can get it off the Linux4sam website using the LegacySAM9N12 page, prepackaged in a tar archive. Note, extract the archive like a pro by doing tar xf t91bootstrap_9n12.tar.gz
which auto determines the archive type.
Changes outline
- Memory
  - 64MB vs 128MB: The at91sam9n12ek (at91sam9n12 Eval Kit) is designed with 128 MB DRAM, but we are only using 64 MB, which is reflected in board/at91sam9n12ek/at91sam9n12ekdf_uboot_defconfig.
  - 4 vs 8 Banks + timings: My memory is also from another company, which means different timings, and due to the smaller size it has fewer banks (4 vs 8). Both are reflected in board/at91sam9n12ek/at91sam9n12ek.c.
  - 100 MHz vs 133 MHz: The memory clock, which is basically the main system bus clock, was dropped from 133 MHz to 100 MHz. I didn't get a chance to really test this, just added it while changing the timings due to potential signal integrity issues, so chances are it's not needed. This is reflected in board/at91sam9n12ek/at91sam9n12ek.h. Note, this also drops the core clock from ~400 MHz to ~300 MHz.
- SPI debug
  - Debug prompts: Added debug serial prompts saying the current master clock, requested SPI clock, and the current scbr register state. This is reflected in driver/at91_spi.c.
- Main program
  - Memory tests: Added memory tests such as alternating patterns, specific patterns, verifying ARM exception vectors, and detecting (un)filled memory. Not all of these tests are functional as of committing, and they will be cleaned up eventually. Reflected in main.c.
  - Debug prompts: Added debug prompts to indicate bootloader progress. Reflected in main.c.
The above changes are in the patch files, apply them sequentially using git apply patch-file.patch
.
Getting started
Config file already included, so just applying the patches and then running make ARCH=arm CROSS_COMPILE="put crosscompiler path here"
should suffice. You can do make menuconfig
to select if you want it to be statically or dynamically linked. If dynamically linked, you need to add the appropriate libraries from your cross compiler's sysroot/lib
directory.
Busybox makes its own inittab if one isn't provided, but it assumes that there are more serial ports than there really are, causing it to spam the serial port with /dev/ttyS# missing
messages every ~1/4 of a second. Supplying your own like this will prevent that.
touch rootfs/etc/inittab
cat <<'EOF' >> rootfs/etc/inittab
::sysinit:/etc/init.d/rcS
::respawn:/bin/sh
::ctrlaltdel:/sbin/reboot
::shutdown:/sbin/swapoff -a
::shutdown:/bin/umount -a -r
::restart:/sbin/init
EOF
Check the root_file_sys directory for more detail and a script which does this for you. Also, this amazing gist shows how to handle a rootfs with busybox, and while it's for the beaglebone the process is nearly identical for sam9n12 after the kernel steps.
If using a statically compiled busybox then there are no shared libraries to copy.
- Simple: Just putting the executable in the rootfs at /bin and calling busybox command
- Correct: Run
make install
which makes a _install dir containing appropriate symlinks, allowing you to run commands normally.
Dependencies
- Cross-Compiler: Generate using Crosstool-NG (recommended since you will have to make a cross-compiler eventually) or download from ARM's launchpad here.
- Busybox: You can get the most recent official source from the official repo like so
git clone git://busybox.net/busybox.git
. No modifications are needed.
Just a collection of gists over time I made for this. Some are scripts some are output files. Left here for archival purposes.
Bootloader efforts
U-Boot efforts
Kernel efforts
Rootfs efforts
- Default busybox inittab spamming missing serial port
- Verifying Busybox execution
- Added inittab to Busybox
- Sodoku execution
- Crosstools-NG output
- First GCC attempt
- GCC compiles code on board
Getting started
Config file not included, so do a menuconfig to make any required changes (none are needed), and then run the appropriate make commands:
make ARCH=arm at91sam9n12ek_defconfig
make ARCH=arm menuconfig
make ARCH=arm CROSS_COMPILE="put crosscompiler path here"
mkimage -A arm -O linux -C none -T kernel -a 20008000 -e 20008000 -n linux-2.6 -d arch/arm/boot/zImage uImage.bin
Put the resulting uImage.bin into the primary partition of the USB flash drive.
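The 20008000 passed to mkimage as both load (-a) and entry (-e) address is a hex value: the DRAM base of 0x20000000 mentioned elsewhere in this document plus 0x8000, the offset conventionally used for ARM kernel images. A sketch of that arithmetic, with a made-up function name, in case the DRAM base ever changes:

```sh
# Compute the kernel load address as DRAM base plus the usual 0x8000 offset.
kernel_load_addr() {
    printf '0x%08X\n' $(( $1 + 0x8000 ))
}

kernel_load_addr 0x20000000
```

The result, minus the 0x prefix, is what goes after -a and -e on the mkimage command line.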
The linux4sam linux page is a good resource for more help if needed.
Dependencies
- Cross-Compiler: Generate using Crosstool-NG (recommended since you will have to make a cross-compiler eventually) or download from ARM's launchpad here.
- Linux Kernel: Atmel requires a good number of changes in the form of patches against the very old
2.6.39
kernel, so upgrading to a more modern one is not really possible currently.
$ wget http://www.kernel.org/pub/linux/kernel/v2.6/linux-2.6.39.tar.bz2
$ tar xvjf linux-2.6.39.tar.bz2
$ cd linux-2.6.39
- Linux Kernel patch: Atmel requires some patches before doing anything, which can be handled by doing the following:
wget ftp://ftp.linux4sam.org/pub/linux/2.6.39-at91/2.6.39-at91sam9n12-exp.tar.gz
tar xf 2.6.39-at91sam9n12-exp.tar.gz
patch -p1 < u-boot-9n12_m2.patch
...
patch -p1 < last_patch_file.patch
Getting started
Config file not included, so first apply my patch, do a menuconf to verify memory addresses are correct, and then run the appropriate make commands:
git apply 0001-Relocating-U-Boot-change-128-to-64MB-USB-boot.patch
make at91sam9n12ek_USB_config
make arch=ARM CROSS_COMPILE="put crosscompiler path here" menuconf
make arch=ARM CROSS_COMPILE="put crosscompiler path here"
Using SAM-BA, dump uboot.bin
to dataflash at 0x8400
using the save file option.
The linux4sam U-Boot page is a good resource for more help if needed.
Check u-boot.map
to verify that u-boot is expecting itself in the correct place in memory.
Targets for config
at91sam9n12ek_mmc
at91sam9n12ek_nandflash
at91sam9n12ek_spiflash
at91sam9n12ek_USB
<-- Use this if not using config file.
Dependencies
- Cross-Compiler: Generate using Crosstool-NG (recommended since you will have to make a cross-compiler eventually) or download from ARM's launchpad here.
- U-Boot: You can get the newest official U-Boot source, but checking out an old git tag is required.
git clone http://git.denx.de/u-boot.git
cd u-boot && git checkout v2011.06 -b yourbranch
Or download an archive that is already at the correct version.
wget ftp://ftp.denx.de/pub/u-boot/u-boot-2011.06.tar.bz2
tar xf u-boot-2011.06.tar.bz2
- U-Boot patch: Atmel requires a patch to U-Boot before doing anything, which can be handled by doing the following:
wget ftp://ftp.linux4sam.org/pub/uboot/u-boot-v2011.06/u-boot-9n12_m2.patch
patch -p1 < u-boot-9n12_m2.patch
Changes outline
- Board configuration
  - LCD: The LCD functionality was disabled to save code size and remove unnecessary pin I/O state changes.
  - 64MB vs 128 MB: There is only 64 MB of DRAM on my board compared to 128 MB on the eval kit.
  - Memory test: U-Boot automatically tests memory upon bootup. The tested region was decreased to just 128 bytes to save boot time, since we already know memory works correctly, and since there is only half as much RAM this had to change anyways.
  - U-Boot address (text base): U-Boot by default is placed into the last megabyte of DRAM via a hardcoded address, but I decided to stick it at 0x20008400 (DRAM memory base is 0x20000000) to stop mixing it up with its location in dataflash, which is at 0x8400, and because the last megabyte of 64 MB of DRAM is at a different address than with 128 MB. Somehow loading the Linux kernel into memory at its base doesn't overwrite U-Boot, just noticed. Whoops.
  - Sys load address: Where the kernel gets loaded, changed from 0x22000000 to DRAM base, somehow not crashing U-Boot in the process.
- Boot options
  - USB boot option: Properly added to boards.cfg and include/configs/at91sam9n12ek.h the ability to boot from USB by starting USB, loading the kernel off USB into memory, and starting the kernel with boot arguments to get the rootfs off USB too.
The above changes are in the patch file, apply it using git apply patch-file.patch
.
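To see why the default "last megabyte" text base had to move when going from 128 MB to 64 MB, the address can be worked out from the DRAM base of 0x20000000. The sketch below is only arithmetic, with a made-up function name; it does not claim to reproduce the exact constant in U-Boot's board header.

```sh
# Start address of the last megabyte of DRAM.
# Arguments: DRAM base address, DRAM size in MiB.
last_mib_base() {
    printf '0x%08X\n' $(( $1 + $2 * 1024 * 1024 - 1024 * 1024 ))
}

last_mib_base 0x20000000 128   # eval kit
last_mib_base 0x20000000 64    # this board
```

This gives 0x27F00000 for the 128 MB eval kit and 0x23F00000 for this board, so an address hardcoded for the eval kit would point past the end of the 64 MB of DRAM that is actually fitted.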
Freescale I.MX233 Embedded Linux System
Status
Currently not working due to DRAM not seeming to be stable. I might have to mess with the timings a good bit more, as well as lower the clock from 133 MHz to 100 MHz.
A 22 uF capacitor under the SDHC holder is too tall, which keeps the holder from closing and the SD card from fully seating, so the SD card has to be held down with a finger during operation.
NOTE Don't bother with the BCB (Boot Control Block) method of booting. As per the errata, it seems to be bugged for large SD cards (2GB+). To enable booting via normal partitions on the SD card, the board has to be booted into recovery mode and opened on the desktop through the bitburner/OTP burner program. Then burn the OTP bits to enable MBR based booting; if you don't burn them, the MCU will always look for a BCB instead of using the MBR.
Resources:
- Koliqi: amazing resource for start to finish for the mx233
- Jancc: somewhat outdated guide on the MX233 but still workable
- Karri: New guide on using the mx233, meh quality
Resources blob from OneTab
- imx233 bootlets and no battery board | Freescale Community
- First Linux board With kicad | LibreCalc
- i.MX233: information about SD/MMC boot from BCB | Freescale Community
- iMX233-OLinuXino - Linux on ARM - eewiki
- IMX233 - Olimex
- All Boards LTIB Config Ubuntu | Freescale Community
- U-Boot for the iMX233-OLinuXino — Christian's Blog
- A new SD card image for the iMX233-OLinuXino — Christian's Blog
- Index of /pub/archlinuxarm/os/
- FTDI PID Unbrick
- embedded - Why would copying a micro SD card using dd fail to produce a bootable card? - Reverse Engineering Stack Exchange
- SD-card with ArchLinux will not boot on Olimex iMX233-OLinuXino-MAXI
- g-lab – u-boot bootloader for imx23-olinuxino board
- Newbie question: can iMx233 Olinuxino-Micro boot from USB?
- dcfldd
- sasamy.narod.ru/IMX23_ROM_Error_Codes.pdf
- i.MX233 board USB not detected on win 7 | Freescale Community
- Booting custom I.MX233 board via BCB | Freescale Community
- iMX233-OLINUXINO SOFTWARE DEVELOPMENT PROGRESS | olimex
- Re: Re: Re: Re: i.MX233 Hand-Held Multimedia Board - Google Groups
- mx233 HTLLC - Google Search
- mx233 bcb signature - Google Search
Just a collection of gists over time I made for this. Some are scripts some are output files. Left here for archival purposes.