
> I should have mentioned that any linear transform can be considered to be a single layer neural network

I think this should be turned around. A single layer neural network can be considered a linear mapping (but not necessarily an orthogonal transform or change of basis, like the DFT).
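
To make the direction of that claim concrete, here is a minimal sketch (not from the thread) that treats the DFT as one particular weight matrix of a bias-free, activation-free layer. The names `dft_weights` and `linear_layer` are made up for illustration; any other weight matrix would make the same "network" compute some other linear mapping, orthogonal or not.

```python
import cmath

def dft_weights(n):
    """n x n DFT matrix: W[k][t] = exp(-2*pi*i*k*t/n)."""
    return [[cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)]
            for k in range(n)]

def linear_layer(weights, x):
    """A 'single layer network' with no bias and no activation: y = W x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

x = [1.0, 0.0, -1.0, 0.0]
X = linear_layer(dft_weights(4), x)
# X is approximately [0, 2, 0, 2] up to floating-point rounding
print([round(abs(v), 6) for v in X])
```

The layer itself knows nothing about Fourier analysis; the DFT is just one point in the space of weights it can hold.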

A clearer example of this is adaptive filters, which are trained in real time using gradient descent.
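
As a rough illustration of what "trained in real time using gradient descent" can look like, here is a sketch of an LMS (least mean squares) adaptive FIR filter identifying an unknown system. LMS is my choice of algorithm for the example; the tap values in `unknown`, the step size `mu`, and the noise-free setup are all assumptions picked to keep it short.

```python
import random

def fir(h, x):
    """Direct-form FIR filter: y[n] = sum_k h[k] * x[n-k], zero initial state."""
    return [sum(hk * x[n - k] for k, hk in enumerate(h) if n - k >= 0)
            for n in range(len(x))]

def lms_identify(x, d, num_taps, mu):
    """Adapt FIR weights w sample by sample so the filter output tracks d[n]."""
    w = [0.0] * num_taps
    buf = [0.0] * num_taps              # most recent input sample first
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]
        y = sum(wk * bk for wk, bk in zip(w, buf))
        e = dn - y                      # instantaneous error
        # Stochastic gradient step on 0.5 * e**2 with respect to w
        w = [wk + mu * e * bk for wk, bk in zip(w, buf)]
    return w

random.seed(0)
unknown = [0.5, -0.3, 0.1]              # hypothetical system to identify
x = [random.uniform(-1.0, 1.0) for _ in range(5000)]
d = fir(unknown, x)                     # what the unknown system outputs
w = lms_identify(x, d, num_taps=3, mu=0.05)
print([round(wk, 4) for wk in w])
```

Each update is one SGD step on a single-layer linear model, which is why the "NN as filter" framing fits adaptive filters so naturally.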

This is an important distinction because thinking of "X is a Neural Net" doesn't provide meaningful insight, whereas "Neural Nets with X properties are a case of linear dynamic systems, here's an example of how we can equate one linear transformation to a neural net" leads you to deeper conclusions on the analysis and synthesis of ANNs in the context of dynamics - which encompasses a much larger surface area than the DFT.



I've sometimes wondered how many signal processing kernels or filter architectures you could learn via SGD given the right data and search strategy. Maybe using something like Neural Architecture Search, so not just the weights / coefficients but the architecture too.

Could it discover an interpolation filter like upfirdn? An IIR Butterworth filter (it would have to be recurrent)? A frequency transfer function derived from two signals?

I imagined that the solutions it found wouldn't be "clean" but have other non-essential operators bloating them. Could it find new architectures that haven't been created from first principles?


I think a big question here is, "what is the goal?" A key distinction is whether the NN is intended to handle the signal directly (the NN is the filter) or if it's just a technique for finding filter coefficients (the NN designs the filter for the signal).

For some tasks, like system identification (used in echo cancellation), the NN is the filter - the aforementioned adaptive filters are used for this case right now. It can also be used for black box modeling, which has numerous applications, in real time or otherwise.

For others like Butterworth (and other classic designs) there's not really a good reason to use a NN. Butterworth (and Chebyshev I/II, elliptic, optimum-L, and others) are filter design formulae with a closed form (for a given filter order) that yield roots of transfer functions that have desirable properties - I'm not sure how a learning approach can beat formulae that are derived from the properties of the filters they design.
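
For reference, the closed form in the Butterworth case is short enough to write out. This sketch computes the poles of the normalized analog low-pass prototype (cutoff at 1 rad/s) and checks the resulting magnitude response against |H(jw)|^2 = 1 / (1 + w^(2n)). The function names are illustrative, and mapping the prototype to a digital filter (e.g. via the bilinear transform) is left out.

```python
import cmath
import math

def butterworth_poles(order):
    """Left-half-plane poles of the normalized analog Butterworth prototype."""
    return [cmath.exp(1j * math.pi * (2 * k + order + 1) / (2 * order))
            for k in range(order)]

def magnitude(poles, w):
    """|H(jw)| for the all-pole transfer function H(s) = 1 / prod(s - p_k)."""
    h = 1.0 + 0.0j
    for p in poles:
        h /= (1j * w - p)
    return abs(h)

poles = butterworth_poles(4)
for w in (0.5, 1.0, 2.0):
    expected = 1.0 / math.sqrt(1.0 + w ** (2 * len(poles)))
    print(w, round(magnitude(poles, w), 6), round(expected, 6))
```

There is nothing to search for here: the pole locations follow directly from requiring a maximally flat passband, which is the point about formulae derived from the filter's own properties.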

There are iterative design algorithms that do not have a closed form, like Parks-McClellan. It is however quite good at what it does - it would be interesting to compare these methods against some design tricks to reduce filter order.

There are some applications of filter design that NNs can do that we don't have good solutions for with traditional algorithms. System identification is one. Another is optimization (in terms of filter order) and iteratively designing stable IIR filters to fit arbitrary frequency curves (for FIR it's a solved problem, at significant extra cost compared to IIR).

As for topologies of the filter itself, that is an interesting angle. Topology changes how quantization affects the filter (if you wanted an FPGA to realize the filter with the fewest bits, topology matters), and for time-variant filters (like an adaptive IIR) there are drastic differences between the behavior of various topologies. I'm not sure how it would manifest with a NN designing the filter.


I meant learning the topology (I called it architecture) from scratch, or minimally from heuristics. What would I discover? Something that we recognize as an IIR filter? Probably not, but if it did, what would it discover that we didn't recognize over years of academic thought but could have from first principles? To me that's the intriguing angle.


I don't think I follow. These designs do come from first principles. In fact the reason we've had so much academic thought put into it is because 100 years ago we designed systems empirically and a few smart people went back to first principles to find the fundamental constraints on filter networks.


That is how we did create such topologies. But if we had not, could they have been discovered through neural architecture search [1]? NAS has been used to find architectures that outperform those that were hand designed. And in RL, Google used a method inspired by NAS [2] to learn methods that were originally built by hand (e.g. TD learning, DQN). I know it's not the same, but if you use a bit of speculative imagination you can translate what has been done to signal processing applications.

[1] https://en.m.wikipedia.org/wiki/Neural_architecture_search

[2] https://ai.googleblog.com/2021/04/evolving-reinforcement-lea...



