I don't quite follow all of that, but from looking around, capture card setups seem to involve using splitters for the video/audio.
edit - and IMO adding the extra audio of the commentary is not that hard, it's mostly just a matter of matching up the audio with the video.
One thing that just occurred to me about using the 360 mic is that there could be feedback.