Beat-synced video editing without librosa: numpy spectral flux drives an ffmpeg montage
Programmatically generating a beat-matched video montage/reel from a single drone clip plus a song.
You do not need librosa for solid beat-syncing. Read the wav with scipy.io.wavfile, compute a spectral-flux onset envelope from a manual rfft STFT (the summed positive frame-to-frame change in magnitude), and estimate tempo from the autocorrelation of that envelope by taking the strongest lag within a plausible BPM range as the beat period in frames. Then lock phase with a comb: for each offset in [0, period), sum the onset energy at every period-th frame and pick the offset with the highest total. That gives an evenly spaced beat grid aligned to real onsets. A frame-wise RMS energy curve computed alongside it maps song structure (verse vs. build vs. drop) onto edit dynamics. A Python generator then emits the ffmpeg filtergraph: cut on beats, with denser cuts and harder effects where energy is high and slow-mo where it dips.
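The analysis steps can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the original script: the function names, the frame size (n_fft=2048), the hop (512), and the 70-180 BPM search range are all hypothetical choices, and a single constant tempo is assumed for the whole song.

```python
import numpy as np
from scipy.io import wavfile


def load_mono(path):
    """Read a wav file and return (sample_rate, mono float signal scaled to +-1)."""
    sr, data = wavfile.read(path)
    y = data.astype(np.float64)
    if y.ndim == 2:                      # stereo -> mono
        y = y.mean(axis=1)
    peak = np.abs(y).max()
    return sr, (y / peak if peak > 0 else y)


def frame_features(y, n_fft=2048, hop=512):
    """Return (spectral-flux onset envelope scaled to 0..1, RMS per frame)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(y) - n_fft) // hop
    flux = np.zeros(n_frames)
    rms = np.zeros(n_frames)
    prev = None
    for i in range(n_frames):
        frame = y[i * hop : i * hop + n_fft]
        rms[i] = np.sqrt(np.mean(frame ** 2))
        mag = np.abs(np.fft.rfft(frame * window))
        if prev is not None:
            # Spectral flux: only increases in magnitude count as onset energy.
            flux[i] = np.maximum(mag - prev, 0.0).sum()
        prev = mag
    return flux / (flux.max() + 1e-12), rms


def estimate_period(env, sr, hop, bpm_min=70.0, bpm_max=180.0):
    """Beat period in frames: strongest autocorrelation lag in the BPM range."""
    fps = sr / hop
    lo = int(np.ceil(60.0 * fps / bpm_max))
    hi = int(np.floor(60.0 * fps / bpm_min))
    e = env - env.mean()
    lags = np.arange(lo, hi + 1)
    ac = np.array([np.dot(e[:-lag], e[lag:]) for lag in lags])
    return int(lags[np.argmax(ac)])


def lock_phase(env, period):
    """Comb: sum the envelope at every period-th frame for each offset."""
    scores = [env[offset::period].sum() for offset in range(period)]
    return int(np.argmax(scores))


def beat_grid(env, sr, n_fft=2048, hop=512):
    """Return (beat times in seconds at frame centers, tempo in BPM)."""
    period = estimate_period(env, sr, hop)
    phase = lock_phase(env, period)
    frames = np.arange(phase, len(env), period)
    times = (frames * hop + n_fft / 2) / sr
    return times, 60.0 * sr / (hop * period)
```

Typical use would be `sr, y = load_mono("song.wav")` (a placeholder filename), then `env, rms = frame_features(y)` and `times, bpm = beat_grid(env, sr)`. Because the period is an integer number of frames, the tempo is quantized to the hop size; a smaller hop gives finer resolution at the cost of more frames.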
Knowing up front that numpy and scipy alone are enough for tempo, phase-locked beats, and an energy map would have saved a librosa install attempt.
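For the montage side, the filtergraph generator might look something like the sketch below. It is an illustration only: the function name, the energy thresholds (0.66 and 0.33), the cut-every-beat vs. every-two-beats rule, and the half-speed slow-mo are hypothetical choices, and per-beat energy values normalized to 0..1 are assumed to be supplied by the caller. It covers cuts and slow-mo with ffmpeg's trim, setpts, and concat filters; the harder effects are left out.

```python
def build_filtergraph(beats, energy, clip_len, hi=0.66, lo=0.33):
    """Build an ffmpeg filter_complex string that cuts one source clip on beats.

    beats:    ascending beat times in seconds
    energy:   one value in 0..1 per beat interval (len(beats) - 1 values)
    clip_len: duration of the source clip in seconds
    """
    parts, labels = [], []
    src = 0.0                                   # read position in the source clip
    i = 0
    while i < len(beats) - 1:
        e = energy[i]
        span = 1 if e >= hi else 2              # cut every beat when energy is high
        j = min(i + span, len(beats) - 1)
        out_dur = beats[j] - beats[i]           # how long this shot lasts on screen
        speed = 0.5 if e < lo else 1.0          # half speed where energy dips
        take = out_dur * speed                  # source seconds consumed
        if src + take > clip_len:               # wrap around at the end of the clip
            src = 0.0
        label = f"v{len(labels)}"
        parts.append(
            f"[0:v]trim=start={src:.3f}:end={src + take:.3f},"
            f"setpts={1.0 / speed:.3f}*(PTS-STARTPTS)[{label}]"
        )
        labels.append(f"[{label}]")
        src += take
        i = j
    parts.append("".join(labels) + f"concat=n={len(labels)}:v=1:a=0[vout]")
    return ";".join(parts)
```

The resulting string could be passed to something like `ffmpeg -i drone.mp4 -i song.wav -filter_complex "<graph>" -map "[vout]" -map 1:a -shortest out.mp4` (filenames are placeholders). The video produced here starts at the first beat, so the song would need to be seeked to that time for the cuts to line up.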