Love ggwave! I used it on a short film set a few years ago to automatically embed slate information into each take and it worked insanely well.
If anyone wants details: I had a smartphone taped to the back of the slate with a UI to enter shot/scene/take, and when I clicked the button it would transmit that information along with a timestamp as sound. This sound was loud enough to be picked up by all microphones on set, including scratch audio on the cameras, phones filming BTS, etc.
In post-production, I ran a script to extract this from all the ingested files and generate a spreadsheet. I then had a script to put the files into folders and a Premiere Pro script to put all the files into a main and a BTS timeline by timestamp.
Yes, timecode exists and some implementations also let you add metadata, but we had a wide mix of mostly consumer-grade gear so that simply wasn't an option.
I posted a short demo video on Reddit at the time, but it got basically no traction: https://www.reddit.com/r/Filmmakers/comments/nsv3eo/i_made_a...
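For anyone curious what the post-production side of a workflow like this can look like, here is a minimal sketch that turns already-decoded slate payloads into a spreadsheet and sorted folders. The field names and folder layout are assumptions for illustration, not the original poster's actual format.

```python
# Minimal sketch of the post-production step described above, assuming each
# decoded ggwave payload has already been parsed into a dict. Field names
# (scene/shot/take/timestamp) and the folder layout are illustrative only.
import csv
import shutil
from pathlib import Path

def build_spreadsheet_and_sort(records, out_dir="sorted", csv_path="slate_log.csv"):
    """records: list of dicts like
    {"file": "A001_C003.mov", "scene": "12", "shot": "3", "take": "2",
     "timestamp": "2021-06-05T14:32:10"}"""
    out = Path(out_dir)
    with open(csv_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["file", "scene", "shot", "take", "timestamp"])
        writer.writeheader()
        for rec in sorted(records, key=lambda r: r["timestamp"]):
            writer.writerow(rec)
            # e.g. sorted/scene_12/shot_3_take_2/A001_C003.mov
            dest = out / f"scene_{rec['scene']}" / f"shot_{rec['shot']}_take_{rec['take']}"
            dest.mkdir(parents=True, exist_ok=True)
            shutil.copy2(rec["file"], dest / Path(rec["file"]).name)
```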
This is a smart solution. We need one of those digital slates with this built-in now.
bfors 62 days ago [-]
Very cool solution!
bjpirt 62 days ago [-]
One of the nicest data through sound implementations I came across was in a kid's toy (often the best source of innovation)
It was a "Bob the Builder" play set and when you wheeled around a digger, etc the main base would play a matching sound. I immediately started investigating and was impressed to see no batteries in the movable vehicles. I realised that each vehicle made a clicking sound as you moved it and the ID was encoded into this which the base station picked up. Pretty impressive to do this regardless of how fast the vehicle was moved by the child.
stavros 62 days ago [-]
Was it based on the frequency of the click?
sejje 62 days ago [-]
>Pretty impressive to do this regardless of how fast the vehicle was moved by the child.
Probably not, eh?
stavros 62 days ago [-]
Probably yes, because the frequency of a note doesn't change based on how quickly the next note is played after it.
sejje 62 days ago [-]
Guess I misunderstood. The first time you said "frequency of the click" -- I would personally respond with clicks per second.
"Frequency of the note" in your next comment clears it up. It probably was that, you're right.
nomel 63 days ago [-]
The acoustic modem is back in style [1]! And, of course, same frequencies (DTMF) [2], too!
DTMF has a special place in the phone signal chain (signal at these frequencies must be preserved, end to end, for dialing and menu selection), but I wonder if there's something more efficient, using the "full" voice spectrum, with the various vocoders [3] in mind? Although it would be much creepier than hearing some tones.
[1] Touch tone based data communication, 1979: https://www.tinaja.com/ebooks/tvtcb.pdf
[2] Touch tone frequency mapping: https://en.wikipedia.org/wiki/DTMF
[3] Optimized encoders/decoders for human speech: https://vocal.com/voip/voip-vocoders/
This isn't DTMF. It's a form of MFSK like DTMF, but it operates on different frequencies and uses six tones at once vs DTMF's two.
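A small sketch of the difference being described: a DTMF symbol is the sum of two fixed tones, while a multi-tone FSK symbol sums several tones at once. The DTMF pair below is the real digit "5"; the six-tone frequencies are placeholders, not ggwave's actual tone plan.

```python
# Contrast a DTMF symbol (two fixed tones) with a multi-tone FSK symbol that
# sums six tones at once. The six frequencies are placeholders only.
import numpy as np

SR = 48000

def tone_sum(freqs, duration=0.08):
    t = np.arange(int(SR * duration)) / SR
    sig = sum(np.sin(2 * np.pi * f * t) for f in freqs)
    return sig / len(freqs)  # normalize so the sum doesn't clip

dtmf_5 = tone_sum([770, 1336])                                # DTMF "5": row + column tone
mfsk_symbol = tone_sum([1875, 2250, 2625, 3000, 3375, 3750])  # six simultaneous tones
```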
nomel 59 days ago [-]
Then I would be very curious to see if this works with aggressive vocoders, like the ones some international VoIP routes use.
Gracana 56 days ago [-]
I would love to hear that recording, haha.
pjc50 62 days ago [-]
> it would be much creepier than hearing some tones.
Hatsune Miku at the speed of a horserace commentator.
(the "vocaloids" are DAW plugins made from chopped up recorded phonemes; Hatsune Miku is voiced by Saki Fujita. Still sounds very inhuman)
bigiain 62 days ago [-]
I'm wondering if frequency-shifting chirps like LoRa uses would work at audio frequencies? You might be able to get the same sort of ability to grab usable signal at many dB below the noise, and be able to send data over normal talking/music audio without it being obvious you're doing so. (I wanted to say "undetectably", but it'd end up showing up fairly obviously to anyone looking for it. Or to Aphex Twin if he saw it in his Windowlicker software...)
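A sketch of what a LoRa-style chirp-spread-spectrum symbol could look like at audio rates, assuming an arbitrary band and symbol length: the symbol value sets where in the upward sweep the chirp starts (wrapping around), which is what a CSS receiver looks for.

```python
# Sketch of a LoRa-style chirp-spread-spectrum symbol at audio rates.
# Each symbol is an upward frequency sweep across the band; the symbol value
# sets where the sweep starts (wrapping around). Band edges and duration are
# arbitrary choices for illustration.
import numpy as np

SR = 48000
F_LO, F_HI = 1000.0, 5000.0      # audio "band" for the chirp
T_SYM = 0.1                      # symbol duration in seconds
N_VALUES = 64                    # 6 bits per symbol

def chirp_symbol(value: int) -> np.ndarray:
    n = int(SR * T_SYM)
    t = np.arange(n) / SR
    bw = F_HI - F_LO
    # Instantaneous frequency sweeps upward, offset by `value`, wrapping at F_HI.
    f_inst = F_LO + ((value / N_VALUES) * bw + (bw / T_SYM) * t) % bw
    phase = 2 * np.pi * np.cumsum(f_inst) / SR   # integrate frequency -> phase
    return np.sin(phase)
```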
nomel 62 days ago [-]
The issue is that the (many) vocoders along the chain remove anything that doesn't match the vocal patterns of a human. When you say hello, it's encoded phonetically at a very low bitrate. Noise, or anything outside what a human vocal cord can do, is aggressively filtered or encoded as vocal-sounding things. Except for DTMF, which must be preserved for backwards compatibility. That's why I say it would be creepy to do something higher-bitrate... your data stream would literally and necessarily be human vocal sounds!
_def 62 days ago [-]
Data exfiltration via bird
genewitch 62 days ago [-]
Yes. JT8 / FT8, wspr, and then the entirety of fldigi.
To get started.
If you need more speed you need to convince me you won't abuse my ham spectrum, but Winlink, PACTOR, and some very slick 16-QAM modems exist. 300 baud to 128 kbit/s or so.
blensor 62 days ago [-]
I love GGWave. We've been using it in our VR game to automatically sync ingame recordings with an external camera.
At the beginning of the recording it plays the code "xrvideo"; the second stage of the merge then looks for that tag in both streams and matches them up.
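The alignment step reduces to simple arithmetic once the tag has been located in each recording (for example by running the ggwave decoder over the audio in chunks). A minimal sketch, with sample rates and variable names assumed:

```python
# The sync step boils down to arithmetic once the "xrvideo" tag has been
# located in each recording. Sample rates and variable names are assumptions.
def sync_offset_seconds(tag_sample_game: int, tag_sample_camera: int,
                        sr_game: int = 48000, sr_camera: int = 48000) -> float:
    """Positive result: trim this many seconds off the start of the camera
    recording to line it up with the in-game capture."""
    return tag_sample_camera / sr_camera - tag_sample_game / sr_game
```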
nickcw 62 days ago [-]
It sounds quite nice.
It is also about the same bitrate as RTTY, which was invented in 1922 and is still in use by radio amateurs around the world.
Here is what that sounds like:
https://youtu.be/wzkAeopX7P0?si=0m0urX7sDp6Jojqe
Not as musical, but quite similar.
The amateur radio community is chock full of innovation for low bandwidth weak signal decodable comm protocols.
There are also the V.xx modem standards, which are kinda dependent on the characteristics of phone lines but might work for audio at a distance?
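Back-of-the-envelope numbers for the RTTY comparison, using the classic 45.45 baud amateur settings; the ggwave figure is the rough 8-16 bytes/s ballpark from its documentation and should be treated as approximate.

```python
# Back-of-the-envelope for "about the same bitrate as RTTY".
# Classic amateur RTTY: 45.45 baud, 5-bit Baudot characters framed by
# 1 start bit and 1.5 stop bits (7.5 bits per character on the wire).
rtty_baud = 45.45
bits_per_char = 1 + 5 + 1.5
rtty_chars_per_sec = rtty_baud / bits_per_char          # ~6 characters/s

# ggwave's audible protocols are documented as roughly 8-16 bytes/s
# (treat this as a ballpark, not a spec).
ggwave_bytes_per_sec = (8, 16)

print(f"RTTY: ~{rtty_chars_per_sec:.1f} chars/s")
print(f"ggwave: ~{ggwave_bytes_per_sec[0]}-{ggwave_bytes_per_sec[1]} bytes/s")
```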
kurisufag 62 days ago [-]
ham optimizes for the wrong thing, imo. look at ft8: perfect for making contacts at low power with stations far, far away, but really only tuned to the particular task of making contacts.
you can package some text alongside, but fundamentally all amateur operators are looking for is a SYN / ACK with callsigns.
lhamil64 62 days ago [-]
There's also JS8call which is a modified version of FT8 meant for actual communication. IIRC you can do some neat things with it, like relaying a message through another user if you don't have a direct path to the recipient.
the-angry-dome 62 days ago [-]
As one of the accursed hams, I wonder what ggwave's propagation profile would be compared to RTTY / CW (Morse code) etc. Would be interesting to try it out.
tdeck 61 days ago [-]
RTTY is the sound of "satellites" in a lot of media.
vodou 62 days ago [-]
This is cool! Some of Teenage Engineering's Pocket Operators, at least the PO-32 [1], use a data-over-sound feature.
Does ggwave use a simple FSK-based modulation just because it "sounds good"? Would it be possible to use a higher-order modulation, e.g. QPSK, in order to achieve higher speeds? Or would that result in too many uncorrectable errors?
[1] https://teenage.engineering/products/po-32
It is a software modem using FSK, but I don't know anything else about it. I am annoyed because I could have had this idea; I'm a ham who really only cares about "Digital Modes", and I have software modems capable of ISDN speeds over "AF".
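The speed question above largely comes down to bits per symbol: FSK sending one of N tones carries log2(N) bits per symbol, while QPSK carries 2 bits in the carrier's phase. Below is a sketch of QPSK on an audio carrier, with an arbitrary carrier frequency and symbol rate; whether it survives real speakers, microphones, and room echo is the hard part that simple FSK sidesteps.

```python
# QPSK on an audio carrier: 2 bits per symbol encoded in the carrier phase.
# Carrier frequency and symbol rate are arbitrary illustrative choices.
import numpy as np

SR = 48000
FC = 4000.0        # audio carrier
BAUD = 500         # symbols per second -> 1000 bit/s raw

PHASES = {(0, 0): 0.25 * np.pi, (0, 1): 0.75 * np.pi,
          (1, 1): 1.25 * np.pi, (1, 0): 1.75 * np.pi}   # Gray-coded QPSK

def qpsk_modulate(bits):
    assert len(bits) % 2 == 0
    sym_len = int(SR / BAUD)
    t = np.arange(sym_len * len(bits) // 2) / SR
    out = np.empty_like(t)
    for i, (b0, b1) in enumerate(zip(bits[::2], bits[1::2])):
        sl = slice(i * sym_len, (i + 1) * sym_len)
        out[sl] = np.cos(2 * np.pi * FC * t[sl] + PHASES[(b0, b1)])
    return out

audio = qpsk_modulate([0, 1, 1, 0, 1, 1, 0, 0])
```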
knowaveragejoe 62 days ago [-]
That's really neat! I realize this demo is a contrived setup, but it is basically an example of what Eric Schmidt was talking about when agents start communicating in ways we can't understand.
whalesalad 62 days ago [-]
Yeah I watched this last night and immediately thought of skynet and how dystopian the world could become in the next few years/decades.
jancsika 62 days ago [-]
There was a research paper on doing data-over-sound with sounds that were designed to be pleasing to humans.
The demos sounded like little R2D2 blips and sputters.
Perhaps a researcher for Microsoft or something.
Anyone know the paper I'm talking about? I can't find it.
In the spirit of abusing an error correction mechanism for aesthetics (see: QR codes with pictures in them, javascript without semicolons) could you do that here? How much abuse can the generated signal take?
Just listening to the samples here they're really not that far off. Could probably use a little softening at the edges on the higher tones but it's nowhere near as unpleasant as it could be.
https://github.com/quiet/quiet-js
Remember seeing them quite a bit a few years ago.
This rules.
https://github.com/PennyroyalTea/gibberlink
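On the earlier question of how much abuse the generated signal can take: one way to build intuition is to corrupt an FEC-protected payload on purpose and see when decoding gives up. The sketch below uses the reedsolo package as a stand-in; the parameters are illustrative and not ggwave's actual FEC settings.

```python
# Corrupt a Reed-Solomon-protected payload byte by byte and watch where
# decoding breaks. `reedsolo` is a stand-in; these are not ggwave's settings.
from reedsolo import RSCodec, ReedSolomonError

rsc = RSCodec(10)                 # 10 parity bytes -> corrects up to 5 byte errors
encoded = bytearray(rsc.encode(b"hello sound"))

for n_errors in range(0, 8):
    corrupted = bytearray(encoded)
    for i in range(n_errors):
        corrupted[i] ^= 0xFF      # trash the first n_errors bytes
    try:
        result = rsc.decode(bytes(corrupted))
        # reedsolo >= 1.0 returns a (decoded, full, errata) tuple
        decoded = result[0] if isinstance(result, tuple) else result
        print(n_errors, "errors:", bytes(decoded))
    except ReedSolomonError:
        print(n_errors, "errors: unrecoverable")
```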
nimish 62 days ago [-]
Acoustic couplers are back baby! Who's up for Phreaking AI?
I remember discovering ggwave a few years ago, before the rebrand. It's still the only working (and fastest verifiable) library that can transmit data over sound.
I couldn't get around to a project using this back then because of college, but now I am integrating it into my startup for frictionless user interaction. I want to thank the creators and contributors of ggwave for doing all the hard work over these years.
If I find something to improve I'd like to contribute to the codebase too.
megadata 62 days ago [-]
Wasn't there a Google project, Chirp or something, that did this over speakers and microphones? It seems to have disappeared.
This is also how modems used to work, for the young'uns who do not know this.
https://audioxpress.com/news/data-over-sound-pioneer-chirp-a...
Seems to have been euthanized.
genewitch 63 days ago [-]
>This is also how modems used to work
they still do, but they used to too.
philsnow 62 days ago [-]
All kinds of modems use this kind of scheme as well. PSK is too low-bandwidth for modern needs, so everything is QAM these days; DOCSIS specifies, I think, QAM-256. Inter-datacenter fiber links use "modems" as well.
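Rough throughput arithmetic for the modulation orders mentioned above; the symbol rate is an example figure, not tied to any particular DOCSIS channel.

```python
# bits per symbol = log2(constellation size); symbol rate is an example only.
import math

symbol_rate = 5_000_000   # 5 Msym/s, illustrative figure
for name, points in [("BPSK", 2), ("QPSK", 4), ("QAM-16", 16), ("QAM-256", 256)]:
    bits = math.log2(points)
    print(f"{name}: {bits:.0f} bits/symbol -> {symbol_rate * bits / 1e6:.0f} Mbit/s raw")
```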
genewitch 62 days ago [-]
Yes, and also soundcard modems: https://i.imgur.com/8mhB4u7.png QAM-16 over a PC soundcard into a radio. It's enough bandwidth to stream video between VLC instances. Not "slow scan TV", either; fast scan.
Uh, don't try and find this if you're going to use it to pollute the spectrum I am licensed for.
codetrotter 62 days ago [-]
Outside of hobbyists that do it for fun, and maybe some data centers using it as an out-of-band means of access, is anyone still using dial-up?
reaperducer 62 days ago [-]
> Outside of hobbyists that do it for fun, and maybe some data centers using it as an out-of-band means of access, is anyone still using dial-up?
I use it to connect to a Windows machine that runs a large piece of machinery in a remote location.
My dry cleaner's credit card reader, too.
dmitrygr 62 days ago [-]
Many aviation fuel pumps in far-out-of-the-way airports use dial-up to authenticate credit cards swiped to pay for the fuel.
flyinghamster 62 days ago [-]
There might still be credit card terminals using 300 bps Bell 103 (which has a short set-up time due to its lack of training sequences).
1200 bps V.23 and Bell 202 are still in use in radio telemetry applications.
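For reference, a sketch of Bell 202-style AFSK, the 1200 baud scheme with 1200 Hz mark and 2200 Hz space tones that APRS also uses, generated phase-continuously:

```python
# Bell 202-style AFSK: 1200 baud, 1200 Hz "mark" and 2200 Hz "space",
# generated with a continuous phase so there are no clicks at bit edges.
import numpy as np

SR = 48000
BAUD = 1200
MARK, SPACE = 1200.0, 2200.0

def afsk1200(bits):
    samples_per_bit = SR // BAUD                 # 40 samples per bit at 48 kHz
    freqs = np.repeat([MARK if b else SPACE for b in bits], samples_per_bit)
    phase = 2 * np.pi * np.cumsum(freqs) / SR    # integrate frequency for continuity
    return np.sin(phase)

waveform = afsk1200([1, 0, 1, 1, 0, 0, 1, 0])
```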
Evidlo 62 days ago [-]
Also see AndFlmsg. It supports more modulation schemes than just FSK, and you can use it as a modem for your ham radio: https://sourceforge.net/projects/fldigi/files/AndFlmsg/
If you're interested in using GGWave in Python, check out ggwave-python, a lightweight wrapper that makes working with data-over-sound easier. You can install it with pip install ggwave-python or pip install ggwave-python[audio], or find it on GitHub: https://github.com/Abzac/ggwave-python.
It provides a simple interface for encoding and decoding messages, with optional support for PyAudio and NumPy for handling waveforms and playback. Feedback and contributions are welcome.
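A listening sketch modeled on the example pattern of the upstream ggwave bindings on PyPI plus PyAudio; the ggwave-python wrapper mentioned above may expose a slightly different interface, so treat this as the general shape rather than its exact API.

```python
# Continuously read microphone audio and try to decode ggwave payloads.
# Follows the upstream `ggwave` bindings' example pattern; exact wrapper APIs
# may differ, so treat the calls here as an assumption.
import ggwave
import pyaudio

p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paFloat32, channels=1, rate=48000,
                input=True, frames_per_buffer=1024)
instance = ggwave.init()

try:
    while True:
        data = stream.read(1024, exception_on_overflow=False)
        res = ggwave.decode(instance, data)
        if res is not None:
            print("Received:", res.decode("utf-8"))
finally:
    ggwave.free(instance)
    stream.stop_stream()
    stream.close()
    p.terminate()
```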
iszomer 62 days ago [-]
I guess this was discussed in some fashion ~16h ago:
- GibberLink [AI-AI Communication] | https://news.ycombinator.com/item?id=43168611
> Bonus: you can open the ggwave web demo https://waver.ggerganov.com/, play the video above and see all the messages decoded!
I could not get this to work unless I played the video on one device and opened it on another. While trying to get it to work from my MBP, waver's spectrum view didn't really show much of anything while the video was playing. Is this the mac filtering audio coming into the microphone to reduce feedback?
ssfrr 62 days ago [-]
Does it work with separate browsers on the same machine? Not sure but I’d guess this sort of filtering would be more common on the browser than the OS
Perhaps a lesson from Ron McCroby would be a start: https://m.youtube.com/watch?v=baEoyXoDVc4
Pfft, it may even have multiple channels one over another, so one can tune to one or another (if one knows how to decode)...