Wednesday, March 4, 2020

Broken Onkyo receiver

A friend of mine had an Onkyo receiver that lost some of its functions after a network update.
Obviously this gave me an excuse to unscrew the receiver, get inside, and look at the Linux system running in there. I am documenting what I found, in the hope that it may help someone else.

This receiver is pretty old and out of warranty, and my friend was not too worried about me breaking it, since that would give him an excuse to go buy a new one. The exact model number is TX-515, but I believe the same or a similar mechanism / file system is used in most Onkyo receivers released around that time. These units cost a lot when new, and their file systems are generally put together in a fragile way, so I do not recommend messing around with one while your receiver is still under warranty.


1. Main processors/components of interest on the HDMI board.

Before touching anything I did my usual Google searches and found that the service manuals for these units are available, with good information and schematics. Kudos to the manufacturer for providing that info.

From the service manual and the physical HDMI board, I found that the components below are the interesting ones.
   a. Texas Instruments DA830 DSP
   b. Marvell 88DE2755 VPU

   Both of the above SoCs have an internal ARM core (v5 architecture) and serial port access. On further inspection of both boot logs, I found that the DA830 boot/filesystem is the most useful one for basic changes.

On further searching I found that the DA830 is well documented and that the DaVinci SDK is available for public download. I studied the SDK and felt confident that there are enough tools available to bring the receiver back to life, even if some components turned out to have a hardware failure.

2. Serial Port

Looking at the board I could identify some candidates for serial port access. With the schematics in hand there was no need to guess.

PA3102 connector has DA830 serial port.
      pin3: RX
      pin6: TX
P8030 connector has Marvell VPU serial port
      pin1: +3.3V
      pin2: RX
      pin3: TX
      pin4: GND

3. The issue

By this time I had almost forgotten the issue :) .... well... it was just an excuse anyway. This TX-515 no longer initializes its USB or network, which makes those functions totally unusable. The HDMI ports, HDMI out etc. still worked, but there was no power on the USB ports and no network access.

Some of these boards develop soldering issues over time because of heat. A search shows that a lot of such failures are related to the same DSP we were going after. But in this specific case the HDMI components were working and there was sound output, which means the DSP does in fact boot. So there was a good chance that part of the filesystem was corrupted and that not all the required modules were getting initialized.

By the way, for folks who are interested in the DSP solder failure issue (which is indicated by no sound when the system is cold), there is a ton of information available online, so I am not documenting that here. The second issue, the one my friend had, did not have any clearly documented analysis or fix.

4. First things first: I connected my USB serial adapter to analyze the DaVinci DSP boot (PA3102 connector).

The goal was to find out what is happening.

5. U-Boot

U-Boot loads properly, and from the log I see that the NAND chip used is
NAND device: Manufacturer ID: 0x98, Chip ID: 0x76 (Toshiba NAND 64MiB 3,3V 8-bit)

And it is not a dumbed-down U-Boot. I was planning to flash the full U-Boot before further debugging if this had been a dumbed-down version, but it seems I don't have to. All the needed bells and whistles are there. I can do tftpboot/usbboot etc. without even touching the existing file systems. That is promising.

U-Boot > help
?       - alias for 'help'
askenv  - get environment variables from stdin
autoscr - run script from memory
base    - print or set address offset
boot    - boot default, i.e., run 'bootcmd'
bootd   - boot default, i.e., run 'bootcmd'
bootm   - boot application image from memory
bootp   - boot image via network using BootP/TFTP protocol
cmp     - memory compare
coninfo - print console devices and information
cp      - memory copy
crc32   - checksum calculation
dhcp    - invoke DHCP client to obtain IP/boot params
echo    - echo args to console
exit    - exit script
fatinfo - print information about filesystem
fatload - load binary file from a dos filesystem
fatls   - list files in a directory (default /)
forceenv  - force set environment variables
go      - start application at address 'addr'
help    - print online help
iminfo  - print header information for application image
imxtract- extract a part of a multi-image
itest   - return true/false on integer compare
loadb   - load binary file over serial line (kermit mode)
loads   - load S-Record file over serial line
loady   - load binary file over serial line (ymodem mode)
loop    - infinite loop on address range
md      - memory display
mdc     - memory display cyclic
mii     - MII utility commands
mm      - memory modify (auto-incrementing)
mtest   - simple RAM test
mw      - memory write (fill)
mwc     - memory write cyclic
nand    - NAND sub-system
nboot   - boot from NAND device
nfs     - boot image via network using NFS protocol
nm      - memory modify (constant address)
ping    - send ICMP ECHO_REQUEST to network host
printenv- print environment variables
rarpboot- boot image via network using RARP/TFTP protocol
reset   - Perform RESET of the CPU
run     - run commands in an environment variable
saveenv - save environment variables to persistent storage
saves   - save S-Record file over serial line
setenv  - set environment variables
sleep   - delay execution for some time
test    - minimal test like /bin/sh
tftpboot- boot image via network using TFTP protocol
usb     - USB sub-system
usbboot - boot from USB device
version - print monitor version

Now, checking the boot parameters, I see that the kernel is loaded from NAND offset 0x454000 (see bootcmd):

U-Boot > printenv
baudrate=115200
bootfile="uImage"
verify=n
ethaddr=00:09:b0:cd:a8:7a
bootargs=console=ttyS1,115200n8 root=/dev/mtdblock8 rw quiet lpj=999424 rootfstype=yaffs mem=56M ip=off
setparam=ok
bootcmd=nboot.jffs2 0xc0200000 0 0x454000
autostart=yes
bootdelay=1
dspboot=yes
stdin=serial
stdout=serial
stderr=serial
ver=U-Boot 1.3.3-svn1998 (Mar  5 2012 - 08:10:29)

Environment size: 356/16380 bytes
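
As a reading aid, the bootcmd above can be taken apart as command, RAM load address, NAND device number and NAND offset. The small shell sketch below only splits that string and converts the offset to decimal bytes. It assumes the usual `nboot loadAddr dev offset` argument order, and describe_nboot is a made-up helper name, not something from the receiver.

```shell
#!/bin/sh
# Sketch: split the bootcmd from printenv into its fields.
# describe_nboot is a hypothetical helper, not something found on the receiver.
# Assumed argument order: nboot <loadAddr> <dev> <offset>.
describe_nboot() {
    # Word-split the command string into $1..$4 (intentionally unquoted).
    set -- $1
    printf 'command=%s load_addr=%s device=%s offset=%s offset_bytes=%d\n' \
        "$1" "$2" "$3" "$4" "$(($4))"
}

BOOTCMD='nboot.jffs2 0xc0200000 0 0x454000'
describe_nboot "$BOOTCMD"
```

Run on any Linux box, this reports offset_bytes=4538368, i.e. the kernel image starts a little over 4.3 MiB into the NAND.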

6. I continued the boot and immediately found the basic issue. The netapp binary keeps crashing, and since it is started with restart-on-crash, it keeps retrying and crashing again.

7. So I went back to U-Boot, changed the bootargs temporarily and booted for single user access. This mounted /dev/mtdblock8 as the root file system. I then checked the service script /etc/init.d/netapp
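
The exact bootargs change is not recorded in this post. One common way to get a root shell on this kind of system, shown here purely as an assumption, is to drop quiet and append init=/bin/sh to the existing bootargs. The sketch below builds such a line from the printenv value above and prints the commands to type at the U-Boot prompt; maintenance_bootargs is a made-up helper name.

```shell
#!/bin/sh
# Sketch: derive a one-off bootargs line for a maintenance boot.
# Assumption: removing "quiet" and appending "init=/bin/sh" yields a root shell;
# the original post does not say which arguments were actually changed.
maintenance_bootargs() {
    out=
    for tok in $1; do                 # intentional word splitting
        [ "$tok" = quiet ] && continue
        out="${out:+$out }$tok"
    done
    printf '%s init=/bin/sh\n' "$out"
}

BOOTARGS='console=ttyS1,115200n8 root=/dev/mtdblock8 rw quiet lpj=999424 rootfstype=yaffs mem=56M ip=off'

# Print the commands to type at the "U-Boot >" prompt.
# Without saveenv the change lives only in RAM and is gone after a reset.
printf 'setenv bootargs %s\n' "$(maintenance_bootargs "$BOOTARGS")"
printf 'boot\n'
```

Because saveenv is never called, the stored environment stays untouched and the next normal power-up uses the original bootargs again.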

Before experimenting further with netapp, I manually started the important services, loaded the modules and then ran netapp, and found that USB power and the network both came up. So it was just the crashing netapp application that halted the boot and kept the other services from coming up.

8. Now let us dig a bit deeper into the netapp service and try to figure out why it is crashing.
If needed, I was ready to compile a new netapp binary.

I see that the netapp binary is called from /opt/onkyo/avr/bin/netapp
But before that, the script mounts a few filesystems. The most interesting one for us is the mounting of /tmp/squashfs followed by a loop mount of /opt

First it mounts a yaffs file system from the partition named by the variable optfs, defined in /etc/boot.properties
This should essentially lead to mounting /dev/mtdblock9 as /mnt/squashfs . Please note that /mnt is linked to /tmp. This is followed by a loop mount of /opt from the image file /mnt/squashfs/da83x_rootfs_opt.img

Interestingly enough, if a file /etc/bootchange exists, it mounts a different partition as /mnt/squashfs: /dev/mtdblock10 in this case.

Eventually it executes the netapp binary from 
/opt/onkyo/avr/bin/netapp
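
The selection logic described above can be sketched roughly as follows. This is not the actual init script: the real one takes the default partition from the optfs variable in /etc/boot.properties, while this sketch hard-codes both device names. The mount options, the pick_opt_partition helper and the ROOT_PREFIX variable are assumptions for illustration, and the script only prints the mounts instead of executing them.

```shell
#!/bin/sh
# Sketch of the selection logic described above, not the actual init script.
# Simplification: the real script reads the default partition from the optfs
# variable in /etc/boot.properties; here both device names are hard-coded.
pick_opt_partition() {
    root=${1:-}                        # optional prefix so it can be tried off-device
    if [ -e "$root/etc/bootchange" ]; then
        echo /dev/mtdblock10
    else
        echo /dev/mtdblock9
    fi
}

# Dry run: print the mounts instead of executing them (mount options assumed).
part=$(pick_opt_partition "${ROOT_PREFIX:-}")
echo "mount -t yaffs $part /mnt/squashfs"
echo "mount -o loop /mnt/squashfs/da83x_rootfs_opt.img /opt"
echo "/opt/onkyo/avr/bin/netapp"
```

The point of the sketch is that a single marker file decides which copy of /opt, and therefore which netapp binary, ends up being used.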

On this system /etc/bootchange was present, so the mtdblock10 partition was being used. From the name, I guess this file was left behind by some mess-up during a firmware update. So out of curiosity I mounted /dev/mtdblock9, loop mounted the img file in there, and saw that the md5sum of that netapp differs from the one in the mtdblock10 image.

Executing that newly found netapp binary did not seem to crash.
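
The checksum comparison from this step can be sketched as a small script. It needs nothing beyond md5sum and cut, so it should also work with a BusyBox-style userland, though that is an assumption about what is installed on the receiver. same_md5 is a made-up helper name.

```shell
#!/bin/sh
# Sketch: report whether two copies of a binary have the same MD5 checksum.
# same_md5 is a hypothetical helper; it only needs md5sum and cut.
same_md5() {
    a=$(md5sum < "$1" | cut -d' ' -f1)
    b=$(md5sum < "$2" | cut -d' ' -f1)
    [ "$a" = "$b" ]
}

# Usage: sh compare.sh <file1> <file2>
if [ "$#" -eq 2 ]; then
    if same_md5 "$1" "$2"; then
        echo "identical"
    else
        echo "different"
    fi
fi
```

With the two images loop mounted at, say, /tmp/opt9 and /tmp/opt10 (hypothetical mount points), the two arguments would be the onkyo/avr/bin/netapp file under each of them.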


9.  So the final FIX....
Mount /dev/mtdblock8 (the root file system) as writable, delete the /etc/bootchange file, and reboot.

That's it. This was easier than expected. I was in fact a bit disappointed, since there was almost no challenge :) . Too straightforward. Almost too boring to tell my friend when he asks me what I had to do to fix it.

mount -o remount,rw /
rm -f /etc/bootchange
mount -o remount,ro /

Then power cycle the receiver.


10. Final thoughts...

Just in case you cannot find the correct files in /dev/mtdblock9, you can always get them by downloading the firmware update files from the web. Someone else has already clearly documented how to extract the file systems from the downloaded Onkyo firmware update files.

Also you could manually load all the modules, and once USB/NET is available you can also do a clean firmware update to get things back to stable.

I was also able to compile Linux from the SDK, so it is possible to replace the kernel with a custom one. But since it is not a device I own, and I don't have any further excuse to mess around with it, I am stopping here and returning the fully functional receiver to my friend.

There were 3 binaries in the Onkyo software under /opt that caught my attention.

1. Audioappe : This is the main audio processing app. You need this to have ANY sound output.

2. spid : This app handles communication with the peripherals. You need this for HDMI communications.

3. netapp : This is needed for the higher level communications like Spotify Connect, firmware update communications, DLNA etc. This application is very resource heavy and uses a lot of CPU. If you mostly listen to audio without using the built-in NET function of the receiver, it may be worth disabling this app most of the time. Note that this binary internally calls the module loader, so disabling it also skips module loading. The modules can, however, be loaded separately by calling /etc/init.d/modloader.

At some point I will create an HTML API to disable netapp while still loading modules like the USB driver etc.
So when you need the Spotify/DLNA/firmware update functions you can just re-enable netapp via the HTML page. I will do that later, and when I do I will publish the details on another page.

I believe disabling netapp would also help with the overheating issues of the DA830 SoC, since it reduces the CPU load drastically.