Sounds like a good project. Your code looks like a good first step. What happens when you run your code? Do you get an error message? If it runs without error, then evaluate the plot of abs(Y) versus frequency: does it look reasonable?
Please attach the audio file file_name.wav, and format your code as "code" in your posting, and run it, so that we can see the plot that you get.
Why do you limit the fft to 1024 points? How long (how many points) is xn? When you limit the fft to 1024 points, the input signal is truncated to its first 1024 points if it is longer. This means you are analyzing only roughly the first 1/48th of a second of sound, which is probably less than what you want.
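To see the truncation for yourself, here is a minimal sketch in Python/NumPy (I have not seen your file, so everything here is an assumption: the 48 kHz sampling rate, the synthetic two-tone signal standing in for your audio, and the helper names `Y_short`, `f` and `peak_hz`). NumPy's `fft(xn, n=1024)` keeps only the first 1024 samples when the input is longer, which is the behaviour described above; leaving the length out transforms the whole signal.

```python
import numpy as np

fs = 48_000                       # ASSUMED sampling rate in Hz (not stated in the post)
t = np.arange(fs) / fs            # time axis for one second of samples
# Synthetic stand-in for the audio: a 1 kHz tone plus a weaker 6 kHz tone.
xn = np.sin(2 * np.pi * 1000 * t) + 0.5 * np.sin(2 * np.pi * 6000 * t)

# With n=1024 the transform uses only the first 1024 samples of xn.
Y_short = np.fft.fft(xn, n=1024)
same_as_slice = np.allclose(Y_short, np.fft.fft(xn[:1024]))
duration_short = 1024 / fs        # seconds of sound actually analysed

# Without n, every sample of xn contributes to the spectrum.
Y = np.fft.fft(xn)
f = np.fft.fftfreq(len(xn), d=1 / fs)   # frequency in Hz of each FFT bin

# Strongest component in the positive-frequency half of the spectrum.
half = len(xn) // 2
peak_hz = f[:half][np.argmax(np.abs(Y[:half]))]

print(f"truncated FFT equals FFT of first 1024 samples: {same_as_slice}")
print(f"1024 samples cover {duration_short * 1000:.1f} ms")
print(f"strongest component in the full spectrum: {peak_hz:.0f} Hz")
```

Plotting `np.abs(Y[:half])` against `f[:half]` gives the magnitude-versus-frequency plot to check for reasonableness.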
Next step will be to implement the ideal filter, to remove frequencies above 4 kHz. Any ideas for how you could do that?
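If you get stuck, one possible direction (only a sketch, not necessarily what your assignment expects) is to zero every FFT bin whose frequency lies above the cutoff and then transform back. The example below is Python/NumPy and again rests on assumptions: a 48 kHz sampling rate, a synthetic two-tone test signal in place of your audio, and the hypothetical names `cutoff_hz`, `Y_filtered` and `yn`.

```python
import numpy as np

fs = 48_000                       # ASSUMED sampling rate in Hz
t = np.arange(fs) / fs            # one second of samples
# Synthetic test signal: 1 kHz (should survive) plus 6 kHz (should be removed).
xn = np.sin(2 * np.pi * 1000 * t) + 0.5 * np.sin(2 * np.pi * 6000 * t)

cutoff_hz = 4000                  # remove everything above 4 kHz

Y = np.fft.fft(xn)
f = np.fft.fftfreq(len(xn), d=1 / fs)   # signed frequency in Hz of each bin

# Ideal ("brick-wall") low-pass: keep bins with |f| <= cutoff, zero the rest.
# Using |f| treats positive and negative frequencies alike, so the
# inverse transform stays real up to rounding error.
Y_filtered = np.where(np.abs(f) <= cutoff_hz, Y, 0)

y_complex = np.fft.ifft(Y_filtered)
yn = y_complex.real               # filtered time-domain signal

print("largest imaginary residue:", np.max(np.abs(y_complex.imag)))
print("max deviation from the pure 1 kHz tone:",
      np.max(np.abs(yn - np.sin(2 * np.pi * 1000 * t))))
```

The result is this clean only because both test tones fall exactly on FFT bins; with real audio, a brick-wall cut like this can cause ringing in the time domain, which is worth listening for.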