I can't make out how to connect edge detection with background removal. Exactly which edge should we take as the background borderline? Can you formalize it? We won't be able to automate borderline recognition if we can't formalize the criteria...
Human interaction in determining the object/background borderline is required because we rely on an entirely different mechanism: human visual pattern recognition, grounded in real-life experience. That is what lets us determine the borders of the object we wish to crop, or even reconstruct a border where it becomes invisible against, and thus indistinguishable from, a similarly colored background pattern.
As deduced from your OP, I thought you wanted to crop out monochromatic backgrounds, possibly with gradients, within certain tolerances, the way it is done in the alpha channel when reconstructing transparency masks for 24-bit images. There, pixel monochromaticity, albeit artificial, can be a very solid criterion for separating background from foreground. The color of the [0,0] pixel, with a given tolerance, might be usable too. But
both would fail as soon as you come across similarly colored areas in the foreground object,
unless you set forth, and/or reconstruct where necessary, the exact borders within which the discrimination rules must not apply, because the pixels inside them in fact belong to the object you wish to preserve.
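To make the failure mode concrete, here is a minimal sketch of the [0,0]-pixel-plus-tolerance rule. Everything in it is hypothetical illustration, not anyone's actual implementation: plain nested lists stand in for an image (a real tool would use PIL or NumPy), and the tolerance is a made-up Euclidean distance in RGB space.

```python
# Hypothetical sketch: classify pixels as "background" when their color is
# within a tolerance of the [0,0] seed pixel. Nested lists stand in for an
# image; a real implementation would use PIL/NumPy.

def color_distance(a, b):
    # Euclidean distance between two RGB triples.
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def background_mask(pixels, tolerance=30):
    # True = "background" under the naive rule, seeded from the top-left pixel.
    seed = pixels[0][0]
    return [[color_distance(p, seed) <= tolerance for p in row]
            for row in pixels]

# 3x3 toy image: white background, red object, but one white pixel that we
# declare to belong to the object, mimicking a similarly colored foreground area.
W, R = (255, 255, 255), (200, 0, 0)
img = [[W, W, W],
       [W, R, W],
       [W, W, R]]  # suppose the white pixel at [1][2] belongs to the object

mask = background_mask(img)
# The tolerance rule flags every white pixel as background, including the one
# that actually belongs to the foreground object -- exactly the failure above.
```

The rule has no notion of *where* a pixel sits, only what color it is, so any object pixel that happens to match the background color is misclassified.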
You seem to want to generalize the task into a fully automatic any-image-against-any-background technique, in which case pattern recognition would be a must and would complicate the task beyond reason for an indie project. Automated flood filling and color replacement are commonplace, but pattern recognition isn't trivial at all. I used to develop red-eye removal filters for my graphics effect libraries, and I once worked on a freelance captcha-breaker project for money at the now defunct RentACoder dot com site; I'm familiar with the problems.
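For contrast with the global color rule, the commonplace flood-fill approach can be sketched as below. Again this is a hedged illustration with made-up names and tolerances, not a reference implementation: a BFS fill seeded from a border pixel marks only background *connected* to that seed, so a background-colored region enclosed by the object survives, though the approach still fails when object and background touch with similar colors.

```python
# Hypothetical sketch: flood fill from a border seed with a color tolerance.
# Only background connected to the seed is marked, so an enclosed
# background-colored pocket inside the object is left alone.
from collections import deque

def flood_background(pixels, tolerance=30, seed=(0, 0)):
    h, w = len(pixels), len(pixels[0])
    ref = pixels[seed[0]][seed[1]]  # reference color at the seed pixel

    def close(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5 <= tolerance

    mask = [[False] * w for _ in range(h)]  # True = background
    queue = deque([seed])
    while queue:
        y, x = queue.popleft()
        if 0 <= y < h and 0 <= x < w and not mask[y][x] and close(pixels[y][x], ref):
            mask[y][x] = True
            # 4-connected neighbors
            queue.extend(((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)))
    return mask

W, R = (255, 255, 255), (200, 0, 0)
# The white pixel at [2][2] is enclosed by a red ring, so the fill from the
# top-left corner never reaches it even though its color matches the background.
img = [[W, W, W, W, W],
       [W, R, R, R, W],
       [W, R, W, R, W],
       [W, R, R, R, W],
       [W, W, W, W, W]]

mask = flood_background(img)
```

This is why flood filling is "commonplace": it needs no pattern recognition at all, just connectivity plus a color test, and that is also exactly why it cannot reconstruct a border the way a human viewer can.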