Actuarial Outpost
 
#31
03-19-2020, 02:35 AM
ambroselo
Member
SOA
Join Date: Sep 2018
Location: Iowa City
College: University of Iowa
Posts: 251

Quote:
Originally Posted by mnm4156
In Chapter 4 (p. 255), why do we not convert the binary target variable, which is treated as an integer, into a factor, as we do for the predictors agecat and veh_age? I see that we apply the factor() function to clm when creating graphs, but why not use as.factor() from the start?

This is inconsistent with what is done in Chapter 5 (p. 313), where we convert the binary target variable to a factor (using as.factor()) at the beginning of the case study, during the data prep tasks.

I'm just trying to understand why we would or would not convert binary target variables to factors from the beginning, rather than applying factor() each time the target variable is used (as when plotting graphs in Chapter 4).

Also, on a side note, I think you need a set.seed() for chunk #18 of your Rmd file for 5.3. Your code is generating different results from the manual, and cross-validation is being used, so a seed needs to be set.
In Section 4.3, converting clm into a factor is optional.
  • If you do not convert it, then the summary() and mean() functions in CHUNKs 1 and 10 will return the proportion of observations whose clm value is 1 as the mean, as in the manual.
  • If you do, then the summary() function will show you the counts of "0" and "1", but the mean() function no longer works (it returns NA with a warning). You will need commands like table(train$clm)/nrow(train) and table(test$clm)/nrow(test) to get the claim occurrence rate.
Whether or not you convert clm into a factor will have a minor effect on the results produced by the createDataPartition() function (try this), but holding the training and test data fixed, the type of clm will not affect any GLM you fit.
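
The trade-off described above can be checked on toy data. The following is a minimal sketch in base R that uses a simulated 0/1 vector in place of the manual's clm variable; the data, the predictor x, and all object names are hypothetical and not taken from the manual.

```r
# Toy data standing in for the case-study dataset (hypothetical, simulated)
set.seed(1)
n   <- 1000
x   <- rnorm(n)
clm <- rbinom(n, size = 1, prob = plogis(-2 + 0.5 * x))
toy <- data.frame(x = x, clm = clm)

# Numeric 0/1 target: mean() gives the claim occurrence rate directly
rate_numeric <- mean(toy$clm)

# Factor target: mean() is not meaningful (it returns NA with a warning),
# so tabulate the levels and divide by the number of rows instead
toy$clm_f   <- factor(toy$clm)
rate_factor <- as.numeric(table(toy$clm_f)["1"]) / nrow(toy)

# The two routes give the same rate
all.equal(rate_numeric, rate_factor)

# With the data held fixed, the binomial GLM is the same either way:
# for a factor response, glm() treats the first level ("0") as failure
fit_num <- glm(clm   ~ x, family = binomial, data = toy)
fit_fac <- glm(clm_f ~ x, family = binomial, data = toy)
all.equal(coef(fit_num), coef(fit_fac))
```

The sketch does not cover createDataPartition(), which belongs to the caret package; comparing its output for a numeric and a factor target is the "try this" exercise suggested above.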

I reran my code for CHUNK 18 in Section 5.3 and got exactly the same results as those in the manual. Cross-validation is indeed used in CHUNK 18, but it draws from the random number stream started by the most recent seed, i.e., set.seed(1) in CHUNK 14. If you run the code in Section 5.3 in order, your results should match those in the manual.
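
The way a seed carries over to later code can be illustrated without caret. In the sketch below, a hypothetical helper, random_folds(), stands in for the random fold assignment that cross-validation performs; it illustrates how R's random number stream behaves and is not the manual's code.

```r
# Hypothetical stand-in for the random fold assignment done in cross-validation
random_folds <- function(n, k = 5) sample(rep(seq_len(k), length.out = n))

# Running the "chunks" in order: the seed set earlier also governs later draws
set.seed(1)
split_a <- sample(100, 70)      # an earlier chunk that uses randomness
folds_a <- random_folds(20)     # a later chunk with no set.seed() of its own

# Repeating the same sequence from the same seed reproduces both results
set.seed(1)
split_b <- sample(100, 70)
folds_b <- random_folds(20)
identical(folds_a, folds_b)     # TRUE

# Re-running only the later chunk continues the stream from wherever it is,
# so the folds will generally differ from the first run
folds_c <- random_folds(20)
identical(folds_a, folds_c)
```

This is why running a single chunk out of order, or running it twice, can give results that differ from a clean top-to-bottom run.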

#32
03-23-2020, 10:34 PM
Yossarian
Member
SOA
Join Date: Jun 2011
Location: SoCal
Posts: 130

On p. 264, in CHUNK 9, the y-axes of the boxplots are limited to the range -1 to 1, without comment.

What is interesting is that the interaction of CONVT with log_veh_value on clm looks different without limits, and also with different limits.

I had previously thought the function ylim() was like a "zoom" into the graph, but it actually removes those observations outside the limits, fundamentally changing the appearance of the boxplots.

Do you see the same thing? Can you confirm?

#33
03-24-2020, 05:41 AM
ambroselo
Member
SOA
Join Date: Sep 2018
Location: Iowa City
College: University of Iowa
Posts: 251

Quote:
Originally Posted by Yossarian
on p264, in Chunk 9, the y axes of the boxplots are limited to -1 to 1, without comment.

What is interesting is that the interaction of CONVT with log_veh_value on clm looks different without limits, and also with different limits.

I had previously thought the function ylim() was like a "zoom" into the graph, but it actually removes those observations outside the limits, fundamentally changing the appearance of the boxplots.

Do you see the same thing? Can you confirm?
Good observation. ylim() drops the observations outside the limits before the boxplots are drawn, so we are in effect looking at the conditional distribution of log_veh_value (given that it is between -1 and 1). In CHUNK 9, it is a good idea to leave out the ylim(-1, 1) command.

By the way, the December 2019 PA exam papers and solutions have been posted.

#34
03-24-2020, 02:26 PM
Yossarian
Member
SOA
Join Date: Jun 2011
Location: SoCal
Posts: 130

Thanks - dreading looking at it again. But I'll get around to it after I've gone through the manual again.

I wish I had used your "3 minutes per point" rule; I wasted too much time on low-point tasks.

#35
03-24-2020, 10:03 PM
ActuaryStudent22
SOA
Join Date: Feb 2020
Posts: 22

Quote:
Originally Posted by ambroselo
Good observation. With the ylim command, we are in effect looking at the conditional distribution of log_veh_value (given that it is between -1 and 1). In CHUNK 9, it is a good idea to ignore the "ylim(-1, 1)" command.

By the way, the December 2019 PA exam papers and solutions have been posted.
It may be useful to note that we can zoom in on the same range while still plotting the unconditional distribution by using coord_cartesian(ylim = c(-1, 1)) instead.
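
To see the difference between the two approaches, the sketch below compares the boxplot medians that ggplot2 computes under ylim() and under coord_cartesian(). It assumes the ggplot2 package is installed and uses simulated data; the data frame toy and its values are hypothetical, not the manual's dataset.

```r
library(ggplot2)

# Simulated data standing in for log_veh_value by clm (hypothetical)
set.seed(1)
toy <- data.frame(
  clm = factor(rep(c(0, 1), each = 500)),
  log_veh_value = c(rnorm(500, mean = 0.5, sd = 1),
                    rnorm(500, mean = 0.8, sd = 1))
)

p <- ggplot(toy, aes(x = clm, y = log_veh_value)) + geom_boxplot()

# ylim() sets scale limits: values outside [-1, 1] are dropped (with a warning)
# before the boxplot statistics are computed
p_ylim <- p + ylim(-1, 1)

# coord_cartesian() only zooms the view: the statistics use all observations
p_zoom <- p + coord_cartesian(ylim = c(-1, 1))

# Extract the medians that each version of the plot would draw
med_full <- layer_data(p)$middle
med_ylim <- suppressWarnings(layer_data(p_ylim)$middle)
med_zoom <- layer_data(p_zoom)$middle

rbind(full = med_full, ylim = med_ylim, zoom = med_zoom)
```

The "zoom" row should match the "full" row, while the "ylim" row reflects only the observations that fall between -1 and 1.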

#36
03-24-2020, 10:49 PM
Yossarian
Member
SOA
Join Date: Jun 2011
Location: SoCal
Posts: 130

Thanks. This is more of the "zoom" function I was expecting.

This is also probably why I got a 5; I bother myself too much with minutiae like this, among other habits detrimental to passing ...

#37
03-31-2020, 04:43 PM
Sader
Member
SOA
Join Date: Feb 2019
Posts: 30
glmnet for 3.5

I'm on Section 3.4.4, where we use the glmnet function. When I try to install glmnet, R tells me that "package ‘glmnet’ is not available (for R version 3.5.0)". How do I use glmnet in the R 3.5 environment that will be used on the PA exam?

#38
03-31-2020, 05:39 PM
Yossarian
Member
SOA
Join Date: Jun 2011
Location: SoCal
Posts: 130

Can you copy-paste the code you are using to install the package?

#39
03-31-2020, 07:27 PM
Sader
Member
SOA
Join Date: Feb 2019
Posts: 30

Quote:
Originally Posted by Yossarian
Can you copy-paste the code you are using to install the package?
I used these SOA instructions to install the frozen version
https://cdn-files.soa.org/e-learning...zenVersion.pdf

I tried using the code to install the glmnet package:
install.packages("glmnet")

which then prompts
package ‘glmnet’ is not available (for R version 3.5.0)

#40
03-31-2020, 08:17 PM
Yossarian
Member
SOA
Join Date: Jun 2011
Location: SoCal
Posts: 130

Quote:
Originally Posted by Sader
I used these SOA instructions to install the frozen version
https://cdn-files.soa.org/e-learning...zenVersion.pdf

I tried using the code to install the glmnet package:
install.packages("glmnet")

which then prompts
package ‘glmnet’ is not available (for R version 3.5.0)
The install code is right - can you install other packages, like "MASS"?
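
A few checks can help narrow down a "package is not available" message. The sketch below is a generic diagnostic and not part of the SOA instructions; the helper is_listed() and the small made-up index are hypothetical, and the commented calls at the end need an internet connection.

```r
# Which R version and which repository is this session using?
R.version.string
getOption("repos")

# Hypothetical helper: is a package listed in a given package index?
# `index` is a matrix with package names as row names,
# in the shape returned by available.packages()
is_listed <- function(pkg, index) pkg %in% rownames(index)

# Offline illustration with a tiny made-up index (placeholder versions)
fake_index <- matrix(c("0.0.0", "0.0.0"), ncol = 1,
                     dimnames = list(c("MASS", "ggplot2"), "Version"))
is_listed("MASS", fake_index)
is_listed("glmnet", fake_index)

# In a live session (requires network access):
# index <- available.packages()
# is_listed(c("glmnet", "MASS"), index)
```

If the live check lists MASS but not glmnet, the repository the session points to does not offer glmnet for that R version, which is a different problem from installs failing in general.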