Even though we don't hear the poem in the poet-acrobat's own voice, we do get a speaker who seems to do a pretty good job of showing us his perspective. So essentially we have a third-person point of view that's limited to the poet himself. Notice we don't hear Beauty's side of the story. We only see her "fair eternal form," but we don't hear her fair eternal thoughts. If we did, the poem would sound a whole lot different, right?
So what does this do to the sound of the poem? If we get only one side of the story, along with a lot of language that stresses that it's all "his" world, maybe we also get the sense that the world kind of revolves around him. After all, everything is "his own making" and everyone is watching him, including the speaker.
But the speaker also has a ringmaster's voice that maintains the fun spectacle of the poem. The speaker tells the poet's story as if he's watching him at that very moment, adding to the excitement and circus-like vibe. So he's not trying to be super-serious about it all. After all, we're at a circus and the point of the show is to entertain the audience.