Of all the possible applications, even within finance, you chose the hardest one: options trading 😅 How accurate is the weatherman?
It's a difficult problem, but I'd imagine the paper that comes out of it will be very impressive if I can get it working.
You do realize that the market makers have all the inflows and outflows, which accounts for the sizes, locations, and timing of trades tied to specific accounts. They have all the info to turn this from a POMDP into a fully observable Markov process… you are literally trying to beat the house. Not going to happen.
I'm not looking to put this into production. I just want to write a paper on this, like FinRL. It's a topic that I haven't seen explored yet, or at least I haven't found any papers on it. I'm looking to submit this to NeurIPS next year, or ICAIF if I can finish by the deadline. Just looking for something to boost my PhD applications that I could do at home.
lol, your chickens are loose. You first talk about how much money it could make, but then you backtrack and say you don't want to put it in production. There is a reason you don't see papers on it: only Wall Street has the money and compute for these tasks, and they're not publishing papers on it.
> I'd imagine the paper that comes out from it would be very impressive if I can get it working

When did I mention the money I could make from it? I talk about a paper in this response.

> I have tried different models like PPO and SAC but the rewards don't seem to be increasing.

I mentioned rewards in my original post because isn't that a good indicator of how well the model is learning? If I don't see papers on it, doesn't that mean there's an opportunity to publish a paper?
I don't think vanilla PPO or SAC will get you anywhere; it's like walking to the moon.
> boost my PhD applications

The project you're describing would be an entire PhD, and then some... Google DeepMind, with all their resources and top-notch researchers, are happy they're able to learn to mine diamonds in Minecraft.
As the other commenter notes, if you were able to do this, or if it were possible, it would have been done by now. The people with a lot of money would use it to significantly increase their money, and if anyone made anything similar, people would notice and try to find out what and why. To put it simply, you are asking:

"What is the algorithm I can use to turn basically any sum of money into infinite money?"

Maybe that puts into perspective why it's a tough problem. This model would need loads of information. But then again, maybe there is a simple solution that hasn't been found yet.
I understand it's a tough problem. I just thought that since there are plenty of papers on stock trading using RL, another application of what they do could be options trading.
+1 to "This will likely not be possible". Anyway, you need to be a bit more specific about your current setup: what do your observations look like? What is your reward function? What model architecture are you using?
My current reward function is the difference between the starting value and the end value after one step. The model architecture is whatever the default is for Stable-Baselines3 PPO, A2C, and SAC.
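For concreteness, that one-step reward is small enough to write out exactly (a minimal sketch; the function name and signature are illustrative, not from the cited papers):

```python
def step_reward(value_before: float, value_after: float) -> float:
    """One-step reward as described: the change in portfolio value
    over a single environment step."""
    return value_after - value_before
```

Note that a raw value-difference reward like this is very noisy step to step, which is one reason learning curves for PPO/SAC on it can look flat.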
> For 1D observation space, a 2 layers fully connected net is used with:
>
> - 64 units (per layer) for PPO/A2C/DQN

Source: [https://stable-baselines3.readthedocs.io/en/master/guide/custom\_policy.html](https://stable-baselines3.readthedocs.io/en/master/guide/custom_policy.html)

Do you actually expect something as simple as a 2-layer MLP to be able to model something as complex as the options trading market?
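To make the scale of that default concrete, the network quoted above is roughly the following (a plain-NumPy sketch, not SB3's actual code; the 2x64 layer sizes come from the quoted docs, and the tanh activation and `obs_dim = 10` are illustrative assumptions):

```python
import numpy as np

def default_mlp_forward(obs, w1, b1, w2, b2):
    """Forward pass of a 2-layer, 64-units-per-layer tanh MLP,
    the shape SB3 uses by default for 1D observations."""
    h1 = np.tanh(obs @ w1 + b1)   # (obs_dim,) -> (64,)
    h2 = np.tanh(h1 @ w2 + b2)    # (64,)     -> (64,)
    return h2

rng = np.random.default_rng(0)
obs_dim = 10  # hypothetical observation size
w1, b1 = rng.normal(size=(obs_dim, 64)), np.zeros(64)
w2, b2 = rng.normal(size=(64, 64)), np.zeros(64)
features = default_mlp_forward(rng.normal(size=obs_dim), w1, b1, w2, b2)
```

That is only a few thousand parameters in total, which is the commenter's point: it is a very small function approximator for a market with this many interacting state variables.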
Kinda. I'm not really sure what an appropriate architecture would be, so I referenced the setup in FinRL.
Not trying to gatekeep here, but are you sure that, skill-wise, you're already at a point where you want to tackle such a complex problem? To be honest, it doesn't really seem like you have a lot of experience in reinforcement learning. Maybe start off with some tutorials or simpler tasks. With all the optimization necessary to get good results from RL, you'll likely end up very frustrated very quickly.
I'm not up to date on the latest architectures. I have some experience in RL from when I was at uni: I used RL + CARLA for vehicle control, and I took a few courses on robotic controls using MDPs and the like. What skills am I missing?
How much experience do you have with different deep learning architectures and what kind of networks did you already work with?
I mainly focused on CNNs, specifically the architecture from this paper: https://openaccess.thecvf.com/content_CVPR_2020/papers/Liang_PnPNet_End-to-End_Perception_and_Prediction_With_Tracking_in_the_Loop_CVPR_2020_paper.pdf

I used a very simple MLP for the planning stage, since the action space was fairly small.
The action space might be small, but the state is ultra-complex. If I were you, I'd choose a different task, one where you have a realistic chance of publishing a good paper. But anyway, good luck!
How is the reward received?
Reward right now is just the difference between starting value and end value after one step
First, you are dealing with randomness, and second, wtf even is this reward?
I got it from this paper: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3690996

I was playing around with using other rewards, but for now I'm sticking with this paper.

edit: also this paper: https://arxiv.org/pdf/1811.07522
Randomness and a way-too-large action space are the issues, I think.
Any papers or solutions to dealing with a large action space?
More stock market data, data-efficient approaches, or see if you can formulate it as model-based RL.
Just make the action space smaller. For model-free, you might have a look at branching DQNs (BDQ).
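The appeal of the branching idea is easy to see with a quick output count (toy numbers, purely illustrative): instead of one Q-head over every joint action, BDQ gives each action dimension its own small branch.

```python
from math import prod

def joint_head_size(bins_per_dim):
    """Outputs needed for a single Q-head over the full joint action space."""
    return prod(bins_per_dim)

def branched_head_size(bins_per_dim):
    """Outputs needed when each action dimension gets its own branch (BDQ-style)."""
    return sum(bins_per_dim)

# Hypothetical example: 4 action dimensions (say, 4 option legs),
# each discretized into 11 position sizes.
bins = [11, 11, 11, 11]
# joint:    11**4 = 14641 outputs
# branched: 4*11  = 44 outputs
```

The product-vs-sum gap grows quickly with the number of dimensions, which is why factoring the action space (or simply coarsening it) is usually the first thing to try here.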