
#1




Thorny GLM Problem
For reasons too strange and dull to go into, I need to fit an additive multivariate GLM (identity link function) to some observed motor insurance premium data which has a gamma-esque distribution (always positive, and noticeably skewed).
I’m using R for this. When I try to fit a model with gamma error structure and identity link function, it fails (with error messages), presumably because any time I have a negative linear predictor on any row of data, a red light flashes because the mean of a gamma distribution can’t be negative.
If I fit a multiplicative model (gamma error structure and log link function), it works fine, because the exp() transformation of the linear predictor ensures the fitted mean is positive. But a multiplicative model isn't of use here.
Plan B is to use a Gaussian error structure (and identity link, aka normal multiple regression), which won’t have the hiccups with negative linear predictors, but this won’t be as accurate because the data I’m fitting the model to is noticeably skewed.
Are there any alternative approaches that I might consider? I'm scratching my head a bit. Thx.
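To make the link-function issue concrete, here is a minimal sketch in plain Python (used purely for illustration, not the R code from the fit itself). The coefficients and covariate rows are hypothetical, chosen only so that one row ends up with a negative linear predictor. Under the identity link that row's fitted mean is negative, which is not a valid gamma mean; under the log link every fitted mean is positive.

```python
import math

def linear_predictor(coefs, row):
    """Intercept plus the sum of coefficient * covariate."""
    return coefs[0] + sum(b * x for b, x in zip(coefs[1:], row))

# Hypothetical coefficients and covariate rows (assumptions for this sketch),
# chosen only so that one row has a negative linear predictor.
coefs = [1.0, -0.8, 0.2]
rows = [(0.0, 1.0), (1.0, 0.0), (2.0, 1.0)]

etas = [linear_predictor(coefs, row) for row in rows]
identity_means = etas                         # identity link: mu = eta
log_means = [math.exp(eta) for eta in etas]   # log link: mu = exp(eta)

for eta, mu_id, mu_log in zip(etas, identity_means, log_means):
    status = "valid" if mu_id > 0 else "INVALID as a gamma mean"
    print(f"eta={eta:+.2f}  identity mu={mu_id:+.2f} ({status})  log mu={mu_log:.2f}")
```

The same thing can happen partway through an iterative fit: even if the final coefficients would give positive means on every row, an intermediate step may not.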
#2




I know you said it's too dull to go into, but I'm curious why you *need* to fit an additive model. That, for me, is the most interesting part of your post.
Also, are all of your response values not just nonnegative but actually positive? If you have any zero values I think that will cause problems trying to fit a Gamma distribution. 
#3




Briefly on the "why" part: I'm working for a client who has a private motor insurance portfolio. The rating structure for the product is additive rather than multiplicative (the latter is much more common of course). Base rate + loading for age + loading for region + ...etc. where the loadings are fixed monetary amounts.
So a premium calculation (very simple example) might be:
base rate          $1000
age loading         $300
region loading      $200
vehicle loading    ($400)
=========================
total premium      $1100
(the currency unit involved isn't actually the US$, but you get the idea)
How quaint! It's unusual and I think not very accurate. I'm trying to persuade the client to move to a multiplicative structure, but they're resisting. I've given them a set of multiplicative rates but they've asked me to give them a revised set of additive rates as well, so that they can use that to tweak their existing rating structure (I'd prefer it if they threw it away and started from scratch).
My response values are all > 0, yes.
Last edited by bigalxyz; 07-09-2020 at 09:45 AM.
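For anyone who wants the two rating structures side by side, here is a small sketch in plain Python. The additive figures are the ones from the example in this post, with the bracketed vehicle loading read as a $400 discount. The multiplicative relativities are hypothetical numbers made up for contrast, not the client's rates.

```python
import math

def additive_premium(base_rate, loadings):
    """Base rate plus fixed monetary loadings (negative values are discounts)."""
    return base_rate + sum(loadings.values())

def multiplicative_premium(base_rate, relativities):
    """Base rate scaled by one factor per rating variable."""
    return base_rate * math.prod(relativities.values())

# Figures from the worked example in the post.
loadings = {"age": 300, "region": 200, "vehicle": -400}
additive_total = additive_premium(1000, loadings)
print(additive_total)  # 1100

# Hypothetical relativities, for contrast only.
relativities = {"age": 1.3, "region": 1.2, "vehicle": 0.6}
multiplicative_total = multiplicative_premium(1000, relativities)
print(round(multiplicative_total, 2))
```

In the additive structure each loading moves the premium by the same monetary amount regardless of the other rating factors, which is what an identity-link model estimates.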
#4




Incidentally, if I use the "Plan B" I mentioned earlier (which amounts to basic multiple regression), I get the following scatter plot (observed vs fitted) which kind of illustrates the problem. Maybe it's the best I can do though...
[Image: scatter plot of observed vs fitted values]

#5




You can use a Gaussian location-scale model. That would at least allow you to increase the variance as a function of the predictors, rather than keep it constant. Not sure how well that will work, but that's something I would try if I faced this dilemma.
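A rough sketch of what such a model optimizes, in plain Python for illustration. It assumes one common parameterization: the mean is linear in the predictors (identity link) and the log of the standard deviation is linear in a second set of predictors. The function name, toy data, and parameter values are all hypothetical; a real fit would hand this negative log-likelihood to an optimizer or use a package that supports modelling the scale.

```python
import math

def location_scale_nll(beta, gamma, X, Z, y):
    """Negative log-likelihood for y_i ~ Normal(mu_i, sigma_i^2), with
    mu_i = X_i . beta (identity link) and log(sigma_i) = Z_i . gamma."""
    nll = 0.0
    for x_row, z_row, y_i in zip(X, Z, y):
        mu = sum(b * x for b, x in zip(beta, x_row))
        log_sigma = sum(g * z for g, z in zip(gamma, z_row))
        sigma = math.exp(log_sigma)
        nll += 0.5 * math.log(2 * math.pi) + log_sigma + 0.5 * ((y_i - mu) / sigma) ** 2
    return nll

# Toy data (hypothetical): intercept plus one covariate; residuals grow with it.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
Z = X                      # same predictors drive the scale in this sketch
y = [1.0, 2.5, 2.0]
beta = [1.0, 1.0]          # fitted means 1, 2, 3

nll_constant = location_scale_nll(beta, [0.0, 0.0], X, Z, y)   # sigma = 1 everywhere
nll_varying = location_scale_nll(beta, [-1.0, 0.5], X, Z, y)   # sigma rises with the covariate
print(nll_constant > nll_varying)
```

On this toy data the version whose standard deviation increases with the covariate has the lower negative log-likelihood, which is the kind of improvement one would hope for when larger premiums also vary more.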

#6




Can you use interactions? That is, let the effect of one covariate depend on the value of another covariate?
Have you considered some polynomial structures, or some form of power transformation of the target, to get "better" results?
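To illustrate both suggestions, here is a small sketch in plain Python. The interaction is just an extra design-matrix column holding the product of two covariates. For the power transformation, Box-Cox is used as one assumed choice among several; the function names and the value of lam are hypothetical.

```python
import math

def design_row_with_interaction(x1, x2):
    """Intercept, two main effects, and their product as an interaction term."""
    return [1.0, x1, x2, x1 * x2]

def box_cox(y, lam):
    """Box-Cox power transform; requires y > 0. lam = 0 is the log transform."""
    if y <= 0:
        raise ValueError("Box-Cox needs a strictly positive response")
    return math.log(y) if lam == 0 else (y ** lam - 1.0) / lam

def inverse_box_cox(z, lam):
    """Map a transformed value back to the original scale."""
    return math.exp(z) if lam == 0 else (lam * z + 1.0) ** (1.0 / lam)

row = design_row_with_interaction(1.0, 0.0)
print(row)  # [1.0, 1.0, 0.0, 0.0]

lam = 0.5                           # hypothetical choice, square-root-like
z = box_cox(1100.0, lam)            # transform a premium of 1100
back = inverse_box_cox(z, lam)      # recovers 1100 up to rounding
print(round(back, 6))
```

One caveat: a model that is additive on the transformed scale is no longer additive in monetary amounts once transformed back, so this may not suit a rating structure built from fixed loadings.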
Tags 
glm, p&c, pricing 

